Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 9 days ago • 60
openai/clip-vit-large-patch14 Zero-Shot Image Classification • 0.4B • Updated Sep 15, 2023 • 32.3M • 2.01k
UniSD: Towards a Unified Self-Distillation Framework for Large Language Models Paper • 2605.06597 • Published 14 days ago • 15
Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning Paper • 2605.06130 • Published 14 days ago • 109
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper • 2605.05185 • Published 15 days ago • 98
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 503
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 629
stefanocarrera/autophagycode_D_train_Qwen3-0.6B_lr0.0001_c142_sem_g4 Viewer • Updated Apr 4 • 103 • 14