Qwen/Qwen3.5-397B-A17B Image-Text-to-Text • 403B • Updated about 21 hours ago • 1.73M • 1.34k
Article KV Caching Explained: Optimizing Transformer Inference Efficiency Jan 30, 2025 • 243
Article Transformers v5: Simple model definitions powering the AI ecosystem Dec 1, 2025 • 306
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 Text Generation • 32B • Updated 1 day ago • 925k • 669
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 12 items • Updated 4 days ago • 200
baidu/ERNIE-4.5-VL-28B-A3B-Thinking Image-Text-to-Text • 30B • Updated 10 days ago • 1.18k • 523
google/embeddinggemma-300m Sentence Similarity • 0.3B • Updated Sep 25, 2025 • 1.93M • 1.52k
Qwen/Qwen3-30B-A3B-Instruct-2507-FP8 Text Generation • 31B • Updated Sep 17, 2025 • 607k • 115