Combinatorial Synthesis: Scaling Code RLVR via Atomic Decomposition and Recombination Paper • 2605.31058 • Published 8 days ago • 1
LiteCoder-Terminal: Scaling Long-Horizon Terminal Environments for Learning Language Agents Paper • 2605.29559 • Published 9 days ago • 16
Learning from Failures: Correction-Oriented Policy Optimization with Verifiable Rewards Paper • 2605.14539 • Published 23 days ago • 5
Beyond Text-Dominance: Understanding Modality Preference of Omni-modal Large Language Models Paper • 2604.16902 • Published Apr 18 • 6
Towards Real-world Human Behavior Simulation: Benchmarking Large Language Models on Long-horizon, Cross-scenario, Heterogeneous Behavior Traces Paper • 2604.08362 • Published Apr 9 • 16
Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards Paper • 2603.09117 • Published Mar 10 • 10
Article Announcing ReasoningLens — Visualizing and Diagnosing LLM Reasoning at a Glance Bowieee • Feb 3 • 7