AgentDropoutV2: Optimizing Information Flow in Multi-Agent Systems via Test-Time Rectify-or-Reject Pruning Paper • 2602.23258 • Published 15 days ago • 28
Awesome SFT datasets Collection A curated list of interesting datasets to fine-tune language models with. • 41 items • Updated 11 days ago • 148
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published Apr 18, 2025 • 139