Reward Hacking in Reasoning Models Collection Do reasoning LLMs actually reason — or learn to game the test? IPT allows for detecting reward hacking in inductive programming tasks (SLR-Bench). • 4 items • Updated about 9 hours ago • 1
Scalable Logical Reasoning Collection A collection of scalable logical reasoning tasks • 14 items • Updated about 9 hours ago • 2
Running Agents 1 Isomorphic Perturbation Testing 🔍 1 Evaluate rule hypotheses for genuine reasoning vs shortcuts
Running Agents 1 SLR-Bench Leaderboard - Reward Hacking in Reasoning Models 🎯 1 Reward shortcut behavior in LLMs via IPT