HiL-Bench (Human-in-Loop Benchmark): Do Agents Know When to Ask for Help? Paper • 2604.09408 • Published 14 days ago