Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data • Paper • 2602.21320 • Published 8 days ago • 10
Efficient RLVR Training via Weighted Mutual Information Data Selection • Paper • 2603.01907 • Published 3 days ago • 14
CHIMERA: Compact Synthetic Data for Generalizable LLM Reasoning • Paper • 2603.00889 • Published 4 days ago • 41
SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale • Paper • 2602.23866 • Published 6 days ago • 55
Code2Math: Can Your Code Agent Effectively Evolve Math Problems Through Exploration? • Paper • 2603.03202 • Published 1 day ago • 5
Surgical Post-Training: Cutting Errors, Keeping Knowledge • Paper • 2603.01683 • Published 3 days ago • 9
How Controllable Are Large Language Models? A Unified Evaluation across Behavioral Granularities • Paper • 2603.02578 • Published 2 days ago • 21
BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing? • Paper • 2603.03194 • Published 1 day ago • 48
Beyond Language Modeling: An Exploration of Multimodal Pretraining • Paper • 2603.03276 • Published 1 day ago • 53
Learn Hard Problems During RL with Reference Guided Fine-tuning • Paper • 2603.01223 • Published 3 days ago • 12
Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization • Paper • 2602.23008 • Published 7 days ago • 34
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation • Paper • 2602.24286 • Published 5 days ago • 70
Discovering Multiagent Learning Algorithms with Large Language Models • Paper • 2602.16928 • Published 14 days ago • 16
"What Are You Doing?": Effects of Intermediate Feedback from Agentic LLM In-Car Assistants During Multi-Step Processing • Paper • 2602.15569 • Published 16 days ago • 13