Token Budgets: An Empirical Catalog of 63 LLM-Agent Budget-Overrun Incidents, with an Affine-Typed Rust Mitigation as a Case Study
Abstract
LLM-agent budget overruns are a documented production failure class: a single retry loop can spend thousands of dollars before an operator notices, and the in-process integrity properties that would prevent it (no aliasing, no double-spend, no use-after-delegation of a cost-bearing value) are enforced, if at all, by ad-hoc wrappers rather than by the type system. Our central contribution is empirical: a catalog of 63 confirmed production incidents from 21 orchestration frameworks (2023–2026), each backed by a quoted GitHub issue and, where reported, a dollar loss, organized into an eight-cluster failure taxonomy (inter-rater Cohen's kappa = 0.837, N = 113), plus 47 supplementary structural entries. As one mitigation evaluated against this taxonomy, we build token-budgets, a 1,180-line Rust crate (no unsafe) that operationalizes affine ownership so that cloning, double-spending, or using a budget after delegating it are compile errors rather than runtime hazards an operator must remember to avoid. The dollar cap is runtime arithmetic under an estimator assumption; the affine layer makes that arithmetic non-bypassable. On single-agent workloads a 4-line Python counter matches the crate at 0/30 overshoot, so the distinguishing value is non-bypassability under operator error in multi-agent delegation: the delegation-fanout race documented in 11 incidents is rejected by the borrow checker at compile time, while the same pattern under asyncio overshoots 30/30 and three disciplined alternatives overshoot 0/30. Across five runtimes, three providers, and a temperature-stratified live-API test (N = 160), the approach reports zero cap violations and zero false refusals, at operational parity with concurrent work. Static over-reservation is 4-6x (2.11x adaptive). Binary-level cap-soundness on the running binary is left open.
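To make the affine-ownership idea concrete, the following is a minimal sketch of the pattern, not the token-budgets crate's actual API. The names `Budget`, `Exhausted`, `spend`, and `delegate`, and the choice of integer cents as the unit, are all assumptions made for illustration. The type deliberately implements neither `Clone` nor `Copy`, and every cost-bearing operation takes `self` by value, so reusing a budget after spending or delegating it is a move error at compile time.

```rust
/// Hypothetical budget handle (illustrative only; not the crate's API).
/// It derives neither `Clone` nor `Copy`, so it cannot be duplicated.
#[derive(Debug)]
pub struct Budget {
    remaining_cents: u64,
}

/// Hypothetical error returned when a request would exceed the cap.
#[derive(Debug, PartialEq)]
pub struct Exhausted;

impl Budget {
    pub fn new(cents: u64) -> Self {
        Budget { remaining_cents: cents }
    }

    pub fn remaining(&self) -> u64 {
        self.remaining_cents
    }

    /// Consumes the budget and returns the reduced one, or refuses.
    /// In this simplified sketch a refused budget is dropped.
    pub fn spend(self, cost_cents: u64) -> Result<Budget, Exhausted> {
        match self.remaining_cents.checked_sub(cost_cents) {
            Some(rest) => Ok(Budget { remaining_cents: rest }),
            None => Err(Exhausted),
        }
    }

    /// Consumes the budget and splits it into (parent, child) shares.
    /// The original handle is moved, so it cannot be spent afterwards.
    pub fn delegate(self, child_cents: u64) -> Result<(Budget, Budget), Exhausted> {
        let rest = self
            .remaining_cents
            .checked_sub(child_cents)
            .ok_or(Exhausted)?;
        Ok((
            Budget { remaining_cents: rest },
            Budget { remaining_cents: child_cents },
        ))
    }
}

fn main() {
    let total = Budget::new(1_000);
    let (parent, child) = total.delegate(400).expect("split fits in the cap");

    // Uncommenting the next line fails to compile: `total` was moved
    // into `delegate`, which is the use-after-delegation case.
    // let _ = total.spend(1);

    let child = child.spend(150).expect("within the child's share");
    assert_eq!(parent.remaining() + child.remaining(), 850);

    // The cap itself is ordinary runtime arithmetic: overspending is refused.
    assert!(child.spend(10_000).is_err());
}
```

The sketch separates the two layers the abstract distinguishes: the cap check in `spend` is plain runtime subtraction, while the by-value `self` receivers are what prevent a caller from holding two live handles to the same dollars.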
Community
The core contribution is empirical, not the Rust crate: a catalog of 63 confirmed LLM-agent budget-overrun incidents across 21 orchestration frameworks (2023–2026), each backed by a quoted GitHub issue and (where reported) a dollar loss, classified at two-rater κ = 0.837 (N = 113). The crate is one case-study mitigation — affine ownership making clone/double-spend/use-after-delegation compile errors. Honest finding: on single-agent workloads a 4-line Python counter ties it 0/30; the affine type only pulls ahead under operator error in multi-agent delegation (the fan-out race is rejected by the borrow checker, while asyncio overshoots 30/30). Binary-level cap-soundness is left open. Full artifact (catalog CSV, crate, proofs, reproduce.sh): https://github.com/sajjadanwar0/token-budgets
Feedback on the taxonomy welcome.