CodeDance: A Dynamic Tool-integrated MLLM for Executable Visual Reasoning Paper • 2512.17312 • Published Dec 19, 2025 • 3
Let ViT Speak: Generative Language-Image Pre-training Paper • 2605.00809 • Published 22 days ago • 32
The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer Paper • 2504.10462 • Published Apr 14, 2025 • 15
Article Fine-tuning SmolLM with Group Relative Policy Optimization (GRPO) by following the Methodologies prithivMLmods • Feb 17, 2025 • 29
Article Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO) ariG23498 • Jan 19, 2025 • 50
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation Paper • 2504.08736 • Published Apr 11, 2025 • 46
Article We now support VLMs in smolagents! +1 m-ric, merve, albertvillanova • Jan 24, 2025 • 113