Hugging Face Internal Testing Organization

company

AI & ML interests

None defined yet.

Recent Activity

hf-transformers-bot updated a dataset about 3 hours ago

hf-internal-testing/transformers_ci_push

hf-transformers-bot updated a dataset about 9 hours ago

hf-internal-testing/transformers_daily_ci_with_torch_nightly

nielsr submitted a paper about 10 hours ago

A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens

hf-transformers-bot

updated a dataset about 3 hours ago

hf-internal-testing/transformers_ci_push

Updated about 3 hours ago • 372

posted an update about 5 hours ago

Post

50

🌐 I've just published Sentence Transformers v5.4 to make the project fully multimodal for embeddings and reranking. The release also includes a modular CrossEncoder and automatic Flash Attention 2 input flattening. Details:

You can now use SentenceTransformer and CrossEncoder with text, images, audio, and video, with the same familiar API. That means you can compute embeddings for an image and a text query using model.encode(), compare them with model.similarity(), and it just works. Models like Qwen3-VL-Embedding-2B and jinaai/jina-reranker-m0 are supported out of the box.
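As a rough sketch of the encode-then-compare flow described above, the first helper below wraps the two calls named in the post, model.encode() and model.similarity(). The names embed_and_compare and cosine_similarity_matrix are hypothetical, and the post does not say how non-text inputs are passed, so the sketch simply forwards whatever lists the caller supplies. The second helper shows, in plain Python on made-up vectors, the cosine comparison that similarity() is assumed to perform by default.

```python
import math


def embed_and_compare(model_name, queries, documents):
    """Encode two lists of inputs and return their pairwise similarity scores.

    Requires the sentence-transformers package; the import is deferred so the
    rest of this sketch runs without it.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    query_embeddings = model.encode(queries)
    document_embeddings = model.encode(documents)
    return model.similarity(query_embeddings, document_embeddings)


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between two lists of equal-width vectors."""
    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    return [
        [sum(x * y for x, y in zip(u, v)) / (norm(u) * norm(v)) for v in b]
        for u in a
    ]


# Made-up 3-dimensional "embeddings", purely for illustration.
query_vectors = [[1.0, 0.0, 0.0]]
document_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
scores = cosine_similarity_matrix(query_vectors, document_vectors)
print(scores)
```

Each row of the result corresponds to one query and each column to one document, so the highest value in a row points at the closest document for that query.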

Beyond multimodal, I also fully modularized the CrossEncoder class. It's now a torch.nn.Sequential of composable modules, just as SentenceTransformer already is. This unlocked support for generative rerankers (CausalLM-based models like mxbai-rerank-v2 and the Qwen3 rerankers) via a new LogitScore module, which wasn't possible before without custom code.
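To make the idea of scoring with a CausalLM concrete, here is a minimal sketch of one common scheme for generative rerankers: read the model's next-token logits for a "yes" and a "no" token and turn them into a relevance probability. This is an assumption about how such rerankers typically work, not a description of the LogitScore module itself. The function name and the token ids are hypothetical.

```python
import math


def yes_no_relevance(next_token_logits, yes_token_id, no_token_id):
    """Softmax over the "yes" and "no" logits, returning P(yes)."""
    yes_logit = next_token_logits[yes_token_id]
    no_logit = next_token_logits[no_token_id]
    # A two-way softmax reduces to a sigmoid of the logit difference.
    return 1.0 / (1.0 + math.exp(no_logit - yes_logit))


# Toy vocabulary of four tokens; ids 2 ("yes") and 3 ("no") are made up.
logits = [0.1, -1.2, 3.0, 1.0]
score = yes_no_relevance(logits, yes_token_id=2, no_token_id=3)
print(round(score, 3))
```

A score above 0.5 means the "yes" logit is the larger of the two; equal logits give exactly 0.5.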

Also, Flash Attention 2 now automatically skips padding for text-only inputs. If your batch has a mix of short and long texts, this gives you a nice speedup and lower VRAM usage for free.
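The saving from skipping padding can be illustrated with simple token counting. The sketch below is not the library's implementation: it assumes the usual variable-length attention layout, where sequences are concatenated into one flat list and cumulative offsets mark where each sequence starts and ends. The function name flatten_batch and the token ids are hypothetical.

```python
def flatten_batch(sequences):
    """Concatenate token-id lists and record cumulative sequence boundaries."""
    flat_tokens = []
    boundaries = [0]
    for tokens in sequences:
        flat_tokens.extend(tokens)
        boundaries.append(len(flat_tokens))
    return flat_tokens, boundaries


# Made-up token ids for a batch mixing one long and two short texts.
batch = [[11, 12, 13, 14, 15, 16], [21], [31, 32]]
flat_tokens, boundaries = flatten_batch(batch)

# A padded layout stores every sequence at the length of the longest one.
padded_count = len(batch) * max(len(tokens) for tokens in batch)
flattened_count = len(flat_tokens)
print(padded_count, flattened_count, boundaries)
```

Here the padded layout would hold 18 token positions while the flattened one holds 9, and the gap widens as the lengths within a batch become more uneven.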

I wrote a blog post walking through the multimodal features with practical examples. Check it out if you want to get started, or just point your Agent to the URL: https://huggingface.co/blog/multimodal-sentence-transformers

This release has set up the groundwork for more easily introducing late-interaction models (both text-only and multimodal) into Sentence Transformers in the next major release. I'm looking forward to it!

hf-transformers-bot

updated a dataset about 9 hours ago

hf-internal-testing/transformers_daily_ci_with_torch_nightly

Updated about 9 hours ago • 242

submitted a paper to Daily Papers about 10 hours ago

A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens

Paper • 2604.04913 • Published 4 days ago • 2

updated a dataset about 13 hours ago

hf-internal-testing/tokenizers-bench

Viewer • Updated about 13 hours ago • 84 • 2

hf-transformers-bot

updated a dataset about 14 hours ago

hf-internal-testing/transformers_flash_attn_ci

Updated about 14 hours ago • 432

hf-transformers-bot

updated a dataset about 15 hours ago

hf-internal-testing/transformers_daily_ci

Updated about 11 hours ago • 4.1k • 3

published a dataset about 21 hours ago

hf-internal-testing/tokenizers-bench

Viewer • Updated about 13 hours ago • 84 • 2

updated a dataset about 21 hours ago

hf-internal-testing/tokenizers-bench-data

Updated about 22 hours ago • 9

published a dataset about 22 hours ago

hf-internal-testing/tokenizers-bench-data

Updated about 22 hours ago • 9

hf-transformers-bot

updated a dataset 1 day ago

hf-internal-testing/transformers_pr_ci

Updated about 23 hours ago • 1.79k

submitted a paper to Daily Papers 6 days ago

MDPBench: A Benchmark for Multilingual Document Parsing in Real-World Scenarios

Paper • 2603.28130 • Published 10 days ago • 10

hf-transformers-bot

updated a dataset 6 days ago

hf-internal-testing/transformers_doc_ci

Updated 6 days ago • 31

hf-transformers-bot

published a dataset 6 days ago

hf-internal-testing/transformers_doc_ci

Updated 6 days ago • 31

updated a dataset 9 days ago

hf-internal-testing/tokenizers-test-data

Viewer • Updated 9 days ago • 22 • 632

published a dataset 9 days ago

hf-internal-testing/tokenizers-test-data

Viewer • Updated 9 days ago • 22 • 632

posted an update 14 days ago

Post

425

I like these models: nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16 and nvidia/NVIDIA-Nemotron-3-Nano-4B-FP8. I also like the paper TradingAgents: Multi-Agents LLM Financial Trading Framework (2412.20138), https://arxiv.org/abs/2412.20138

mlabonne/FineTome-100k

submitted a paper to Daily Papers 17 days ago

Do VLMs Need Vision Transformers? Evaluating State Space Models as Vision Encoders

Paper • 2603.19209 • Published 21 days ago • 5

submitted a paper to Daily Papers 21 days ago

V-JEPA 2.1: Unlocking Dense Features in Video Self-Supervised Learning

Paper • 2603.14482 • Published 25 days ago • 28

submitted a paper to Daily Papers 22 days ago

Omnilingual MT: Machine Translation for 1,600 Languages

Paper • 2603.16309 • Published 23 days ago • 20