P1: Mastering Physics Olympiads with Reinforcement Learning Paper • 2511.13612 • Published Nov 17, 2025 • 134
Article Introducing smolagents: simple agents that write actions in code. +1 m-ric, merve, thomwolf • Dec 31, 2024 • 1.19k
Kimi-K2 Collection Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intelligence • 5 items • Updated Jan 27 • 173
Article Kimi-VL-A3B-Thinking-2506: A Quick Navigation moonshotai • Jun 21, 2025 • 77
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published Jun 16, 2025 • 276
AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale Paper • 2505.08311 • Published May 13, 2025 • 19
AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset Paper • 2504.16891 • Published Apr 23, 2025 • 27
Kimi k1.5: Scaling Reinforcement Learning with LLMs Paper • 2501.12599 • Published Jan 22, 2025 • 130
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? Paper • 2411.16489 • Published Nov 25, 2024 • 45
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published Dec 9, 2024 • 86
SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator Paper • 2412.12094 • Published Dec 16, 2024 • 11
NuminaMath Collection Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 7 items • Updated Feb 10, 2025 • 81
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies Paper • 2407.13623 • Published Jul 18, 2024 • 56
Article RegMix: Data Mixture as Regression for Language Model Pre-training SivilTaram • Jul 11, 2024 • 15