Domino: Decoupling Causal Modeling from Autoregressive Drafting in Speculative Decoding Paper • 2605.29707 • Published 7 days ago • 133
LiteCoder-Terminal: Scaling Long-Horizon Terminal Environments for Learning Language Agents Paper • 2605.29559 • Published 7 days ago • 15
MolmoAct2: Action Reasoning Models for Real-world Deployment Paper • 2605.02881 • Published May 4 • 348
GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment Paper • 2605.19577 • Published 16 days ago • 58
Draft Less, Retrieve More: Hybrid Tree Construction for Speculative Decoding Paper • 2605.20104 • Published 16 days ago • 7
Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling Paper • 2605.13301 • Published 22 days ago • 159
RigidFormer: Learning Rigid Dynamics using Transformers Paper • 2605.09196 • Published 26 days ago • 14
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 28B • Updated Apr 6 • 163k • • 2.87k
Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs? Paper • 2603.24472 • Published Mar 25 • 57
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation Paper • 2603.19220 • Published Mar 19 • 69
CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs Paper • 2602.03048 • Published Feb 3 • 32
Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning Paper • 2602.01058 • Published Feb 1 • 45