Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning Paper • 2606.04923 • Published 21 days ago • 40
Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning Paper • 2606.04923 • Published 21 days ago • 40
Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning Paper • 2606.04923 • Published 21 days ago • 40
Echoes as Anchors: Probabilistic Costs and Attention Refocusing in LLM Reasoning Paper • 2602.06600 • Published Feb 6 • 3
Echoes as Anchors: Probabilistic Costs and Attention Refocusing in LLM Reasoning Paper • 2602.06600 • Published Feb 6 • 3
Echoes as Anchors: Probabilistic Costs and Attention Refocusing in LLM Reasoning Paper • 2602.06600 • Published Feb 6 • 3