Scaling Teams or Scaling Time? Memory Enabled Lifelong Learning in LLM Multi-Agent Systems
Abstract
LLM multi-agent systems exhibit non-monotonic scaling behavior where memory design significantly impacts long-term performance, with smaller teams sometimes outperforming larger ones when experience reuse is optimized.
Large language model (LLM) multi-agent systems can scale along two distinct dimensions: by increasing the number of agents and by improving through accumulated experience over time. Although prior work has studied these dimensions separately, their interaction under realistic cost constraints remains unclear. In this paper, we introduce a conceptual scaling view of multi-agent systems that jointly considers team size and lifelong learning ability, and we study how memory design shapes this landscape. To this end, we propose LLMA-Mem, a lifelong memory framework for LLM multi-agent systems under flexible memory topologies. We evaluate LLMA-Mem on MultiAgentBench across coding, research, and database environments. Empirically, LLMA-Mem consistently improves long-horizon performance over baselines while reducing cost. Our analysis further reveals a non-monotonic scaling landscape: larger teams do not always produce better long-term performance, and smaller teams can outperform larger ones when memory better supports the reuse of experience. These findings position memory design as a practical path for scaling multi-agent systems more effectively and more efficiently over time.
Community
Excited to share our work on the scaling of multi-agent systems! We move beyond just "adding more agents" to jointly study accumulated experience as a second dimension.
The Framework: We introduce LLMA-Mem, a lifelong memory system with flexible topologies (individual vs. shared) for MAS.
Key Insight: We find a non-monotonic scaling landscape—smaller teams with superior memory design often outperform larger, "memory-poor" teams while significantly reducing inference costs.
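To make the "individual vs. shared" topology distinction concrete, here is a minimal sketch of the two memory wirings. All names (`MemoryStore`, `build_memories`) and the keyword-overlap retrieval are illustrative assumptions, not the paper's actual implementation:

```python
class MemoryStore:
    """Append-only store of (task, outcome) experiences with naive keyword retrieval."""

    def __init__(self):
        self.entries = []

    def write(self, task, outcome):
        self.entries.append((task, outcome))

    def retrieve(self, query, k=3):
        # Rank past experiences by word overlap with the query
        # (a stand-in for whatever retrieval the real system uses),
        # and drop entries with no overlap at all.
        scored = [(len(set(t.split()) & set(query.split())), (t, o))
                  for t, o in self.entries]
        scored = [entry for score, entry in sorted(scored, reverse=True) if score > 0]
        return scored[:k]


def build_memories(agent_ids, topology):
    """Individual: one private store per agent. Shared: all agents use one store."""
    if topology == "shared":
        store = MemoryStore()
        return {a: store for a in agent_ids}
    return {a: MemoryStore() for a in agent_ids}


# Under the shared topology, one agent's experience is visible to its teammates.
memories = build_memories(["coder", "reviewer"], topology="shared")
memories["coder"].write("fix off-by-one bug in parser", "passed tests")
print(memories["reviewer"].retrieve("parser bug"))
```

Under the individual topology the same `retrieve` call would return nothing, since the reviewer's private store is empty; this is exactly the experience-reuse gap the post argues memory design should close.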
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Structurally Aligned Subtask-Level Memory for Software Engineering Agents (2026)
- Experience as a Compass: Multi-agent RAG with Evolving Orchestration and Agent Prompts (2026)
- AdaMem: Adaptive User-Centric Memory for Long-Horizon Dialogue Agents (2026)
- Collaborative Multi-Agent Optimization for Personalized Memory System (2026)
- Choosing How to Remember: Adaptive Memory Structures for LLM Agents (2026)
- Agent Q-Mix: Selecting the Right Action for LLM Multi-Agent Systems through Reinforcement Learning (2026)
- Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory (2026)
Get this paper in your agent:

hf papers read 2604.03295

Don't have the latest CLI?

curl -LsSf https://hf.co/cli/install.sh | bash