Article KV Caching Explained: Optimizing Transformer Inference Efficiency Jan 30, 2025 • 244
Article Transformers v5: Simple model definitions powering the AI ecosystem +2 Dec 1, 2025 • 306
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 12 items • Updated 3 days ago • 195
Article Welcome EmbeddingGemma, Google's new efficient embedding model +4 Sep 4, 2025 • 273
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM +2 Mar 12, 2025 • 490
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper • 2412.09596 • Published Dec 12, 2024 • 97