On the Entropy Dynamics in Reinforcement Fine-Tuning of Large Language Models • Paper 2602.03392 • Published Feb 3
Dr. Seg: Revisiting GRPO Training for Visual Large Language Models through Perception-Oriented Design • Paper 2603.00152 • Published 23 days ago