Ankit Dhiman

ankitdhiman

https://0xnktd.github.io

AI & ML interests

CV, NLP, Speech Anonymization

Recent Activity

upvoted a collection about 22 hours ago

Nemotron Code & SWE

Collection

Datasets for building models that write, debug, and reason about code. Covers competitive programming, software engineering, and code pretraining. • 14 items • Updated about 22 hours ago • 5

updated a Space 2 months ago

Dalaal Env

🌐

Run browser‑based RL tasks via API for LLM agents

published a Space 2 months ago

Dalaal Env

🌐

Run browser‑based RL tasks via API for LLM agents

liked a dataset 3 months ago

nvidia/Retrieval-Synthetic-NVDocs-v1

Viewer • Updated Mar 30 • 15.1k • 472 • 18

liked 2 models 5 months ago

YatharthS/MiraTTS

Text-to-Speech • 0.5B • Updated Dec 24, 2025 • 276 • 188

nvidia/magpie_tts_multilingual_357m

Text-to-Speech • Updated May 3 • 1.08k • 138

updated a model 6 months ago

ankitdhiman/nero

0.4B • Updated Dec 12, 2025

reacted to tomaarsen's post with 🔥 6 months ago

Post

4473

🐦‍🔥 I've just published Sentence Transformers v5.2.0! It introduces multi-processing for CrossEncoder (rerankers), multilingual NanoBEIR evaluators, similarity score outputs in mine_hard_negatives, Transformers v5 support and more. Details:

- CrossEncoder multi-processing: Similar to SentenceTransformer and SparseEncoder, you can now use multi-processing with CrossEncoder rerankers. Useful for multi-GPU and CPU settings, and simple to configure: just pass device=["cuda:0", "cuda:1"] or device=["cpu"]*4 to the model.predict or model.rank calls.

- Multilingual NanoBEIR Support: You can now use community translations of the tiny NanoBEIR retrieval benchmark instead of only the English one, by passing dataset_id, e.g. dataset_id="lightonai/NanoBEIR-de" for the German benchmark.

- Similarity scores in Hard Negatives Mining: When mining for hard negatives to create a strong training dataset, you can now pass output_scores=True to get similarity scores returned. This can be useful for some distillation losses!

- Transformers v5: This release works with both Transformers v4 and the upcoming v5. In the future, Sentence Transformers will only work with Transformers v5, but not yet!

- Python 3.9 deprecation: Now that Python 3.9 has lost security support, Sentence Transformers no longer supports it.

Check out the full changelog for more details: https://github.com/huggingface/sentence-transformers/releases/tag/v5.2.0

I'm quite excited about what's coming. There's a huge draft PR with a notable refactor in the works that should bring some exciting support. Specifically, better multimodality, rerankers, and perhaps some late interaction in the future!
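For readers who want to try the CrossEncoder multi-processing item from the post, here is a minimal sketch of how the device list might be built and passed along. It assumes sentence-transformers 5.2.0 or later is installed and relies only on the post's own description that model.rank accepts a list of devices. The helper names build_device_list and rerank are hypothetical, and the model name is just an illustrative choice, not something the post mentions.

```python
from typing import List, Sequence


def build_device_list(gpus: Sequence[str] = (), cpu_workers: int = 4) -> List[str]:
    """Return one device string per worker process (hypothetical helper).

    Uses the given GPUs if any were passed, otherwise falls back to
    `cpu_workers` CPU workers, mirroring device=["cpu"]*4 from the post.
    """
    if gpus:
        return list(gpus)
    if cpu_workers < 1:
        raise ValueError("cpu_workers must be at least 1")
    return ["cpu"] * cpu_workers


def rerank(query: str, documents: Sequence[str], devices: List[str]):
    """Rank documents for a query, spreading the work over `devices`.

    Assumption: sentence-transformers >= 5.2.0, where (per the post)
    `rank` accepts a list of devices. The model name is illustrative.
    """
    # Imported lazily so build_device_list works without the library installed.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
    return model.rank(query, list(documents), device=devices)
```

A call such as rerank("what is a reranker?", docs, build_device_list(["cuda:0", "cuda:1"])) would then follow the multi-GPU pattern from the post. Since worker processes are involved, it is probably safest to make that call from under an if __name__ == "__main__": guard.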