view article Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries +7 aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, nouamanetazi, lvwerra, sergiopaniego • Mar 10 • 160
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published Mar 25 • 98
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings Paper • 2603.13594 • Published Mar 13 • 149
view article Article Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance ServiceNow-AI • Dec 9, 2025 • 84
view article Article Smol2Operator: Post-Training GUI Agents for Computer Use +3 A-Mahla, merve, sergiopaniego, reach-vb, lewtun • Sep 23, 2025 • 138
view article Article Gaia2 and ARE: Empowering the community to study agents +9 clefourrier, gregmialz, mlcu, mortimerp9, XciD, tfrere, evijit, RomainFroger, dheeraj7596, CarolinePascal, upiter • Sep 22, 2025 • 135
AU-Harness: An Open-Source Toolkit for Holistic Evaluation of Audio LLMs Paper • 2509.08031 • Published Sep 9, 2025 • 21
view article Article SmolLM3: smol, multilingual, long-context reasoner +21 eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 778
MiniMax-M1 Collection MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. • 6 items • Updated Apr 15 • 119
General-Reasoner Collection Advancing LLMs' general reasoning capabilities • 9 items • Updated Oct 12, 2025 • 6
view article Article Selective fine-tuning of Language Models with Spectrum anakin87 • Sep 3, 2024 • 36
view article Article The N Implementation Details of RLHF with PPO +1 vwxyzjn, tianlinliu0121, lvwerra • Oct 24, 2023 • 72
view article Article BigCodeBench: The Next Generation of HumanEval +7 terryyz, ganler, SivilTaram, huybery, Muennighoff, dpfried, harmdevries, lvwerra, clefourrier • Jun 18, 2024 • 54
[lecture artifacts] aligning open language models Collection artifacts referenced in the talk timeline! Slides: https://docs.google.com/presentation/d/1quMyI4BAx4rvcDfk8jjv063bmHg4RxZd9mhQloXpMn0/edit?usp=sharin • 63 items • Updated Apr 17, 2024 • 58