Getting Up to Speed on Multi-Agent Systems, Part 2: The Vocabulary
If you try to read multi-agent systems papers without the vocabulary, you will get nowhere. The field has settled on a shared set of words for the pieces of a system, and every paper now slots into those categories even when it pretends to be doing something novel. This post is about those words. Once you know them, you can read any paper in the field and know what it is and isn’t claiming.
Three surveys have done the work of consolidating the vocabulary. Each one cuts the space slightly differently, but together they give you the conceptual toolkit.
Tran et al.: Actors, Types, Structures, Strategies
The most useful single survey is Tran et al. (2025). It defines a multi-agent system formally as a tuple of agents, collaboration channels, collective goals, and an environment.
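In code, that tuple is nothing exotic. Here is a loose transcription; the field names are my shorthand, not the survey’s notation.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class MultiAgentSystem:
    """Tran et al.'s formal tuple, loosely transcribed.

    Field names are illustrative shorthand, not the survey's notation.
    """
    agents: list[Any]     # the set of participating agents
    channels: list[Any]   # collaboration channels between agents
    goals: list[str]      # collective goals the system pursues
    environment: Any      # the shared environment agents act in
```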
Then it taxonomizes the space along four axes:

- Type: cooperation, competition, or coopetition
- Structure: centralized, decentralized, or hierarchical
- Strategy: role-based, rule-based, or model-based
- Architecture: static or dynamic
Most of the famous wave-1 papers are in one box: cooperative, hierarchical, role-based, static. Everyone is doing roughly the same thing, with small variations in how agents pass messages and what they produce at each step. The survey’s most useful claim is that the optimal structure varies with the task. There is no universal topology.
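The four axes are small enough to write down as plain enums. This is my encoding, not the survey’s, with ChatDev’s box filled in to match the comparison table later in this post:

```python
from enum import Enum

class Type(Enum):
    COOPERATION = "cooperation"
    COMPETITION = "competition"
    COOPETITION = "coopetition"

class Structure(Enum):
    CENTRALIZED = "centralized"
    DECENTRALIZED = "decentralized"
    HIERARCHICAL = "hierarchical"

class Strategy(Enum):
    ROLE_BASED = "role-based"
    RULE_BASED = "rule-based"
    MODEL_BASED = "model-based"

class Architecture(Enum):
    STATIC = "static"
    DYNAMIC = "dynamic"

# ChatDev's box, matching the table below:
chatdev = (Type.COOPERATION, Structure.HIERARCHICAL,
           Strategy.ROLE_BASED, Architecture.STATIC)
```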
Zhou et al.: The Five-Component Agent
Zhou et al. (2024) take a different cut. Instead of asking how agents coordinate, they ask what each agent actually has inside it. They propose a five-component model that applies to any LLM-based agent.
The five components:

- Profile: the agent’s defined identity and role
- Perception: how the agent takes in information
- Self-Action: what the agent does on its own
- Mutual Interaction: how agents communicate with one another
- Evolution: how the agent changes over time
If you come from distributed systems, the labels sound like things you’d recognize from any actor system. Profile is identity. Perception is input. Self-Action is local state plus computation. Mutual Interaction is message passing. Evolution is the weakest piece, because nobody has really figured out what “agent learning from its own history” looks like in production.
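Spelled out as a skeletal actor, the model looks something like this. It is my sketch of Zhou’s five components, not code from the paper:

```python
class Agent:
    """Zhou et al.'s five components, read as an actor. A sketch only."""

    def __init__(self, profile: str):
        self.profile = profile    # Profile: identity / role description
        self.history: list = []   # local state backing Self-Action

    def perceive(self, observation: str) -> None:
        # Perception: take in input from the environment or other agents.
        self.history.append(("obs", observation))

    def act(self) -> str:
        # Self-Action: local computation over local state. In a real
        # agent, this is where the LLM call conditioned on profile
        # plus history would go.
        return f"[{self.profile}] acting on {len(self.history)} events"

    def interact(self, other: "Agent", message: str) -> None:
        # Mutual Interaction: plain message passing between actors.
        other.perceive(message)

    def evolve(self) -> None:
        # Evolution: learn from one's own history. Deliberately a stub;
        # no wave-1 system has a production answer here.
        pass
```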
Chen et al.: Applications and Unsolved Challenges
The third survey, Chen et al. (2024), is the one I’d skim rather than read in full. The applications chapter is useful, but what you actually want is the challenges section.
Chen et al. group the open problems into levels; the two this post leans on are the interaction level and the evaluation level.
The interaction-level challenges are the ones that most concern me. Efficiency explosion is the observation that multi-agent systems scale worse than linearly because each agent’s autoregressive generation multiplies the token cost. Accumulative error is what it sounds like: errors made in round one propagate and amplify in rounds two, three, four.
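A toy cost model makes the superlinearity visible. Assume a debate where every agent rereads the full transcript each round; every number here is invented, for shape only.

```python
def debate_tokens(n: int, r: int, m: int = 500) -> int:
    """Toy model: n agents, r rounds, ~m tokens per message.

    Each round, every agent rereads the whole transcript (input tokens)
    and writes one message (output tokens).
    """
    total = 0
    transcript = 0  # tokens accumulated in the shared transcript
    for _ in range(r):
        total += n * transcript  # every agent rereads everything so far
        transcript += n * m      # n new messages land this round
        total += n * m           # output tokens this round
    return total

for n in (1, 2, 4, 8):
    print(n, debate_tokens(n, r=3))
# 1 3000 / 2 9000 / 4 30000 / 8 108000: doubling the agents roughly
# triples the bill, because the transcript they all reread grows with n too.
```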
Mapping Papers Into These Taxonomies
The payoff of the vocabulary is that you can now categorize any paper in the field at a glance.
| System | Type | Structure | Strategy | Architecture |
|---|---|---|---|---|
| CAMEL | Cooperation | Decentralized pair | Role-based | Static |
| ChatDev | Cooperation | Hierarchical pipeline | Role-based | Static |
| MetaGPT | Cooperation | Centralized pool | Role + Rule-based | Static |
| Debate (Du) | Competition | Decentralized all-to-all | Rule-based rounds | Static |
| Generative Agents | Coopetition | Decentralized open env | Model-based retrieval | Dynamic |
| Anthropic Research | Cooperation | Centralized orchestrator | Role-based | Dynamic |
| AutoGen | Configurable | Configurable | Configurable | Static or Dynamic |
Most of the canonical papers sit in the cooperative, role-based, static quadrant. The interesting ones are the exceptions. Du et al. is the rare competitive debate paper. Generative Agents is the rare fully dynamic system. AutoGen tries to be everything at once, which is its whole thesis.
The Gap the Vocabulary Exposes
The taxonomies do something else besides categorize papers. They make gaps visible.
Zhou’s “Evolution” component is the weakest across every system. Nobody has a real story for how agents learn from their own history in production. MetaGPT’s “test-driven retry” is the closest wave-1 paper to Evolution, and it’s still just a bounded retry loop with no memory of past attempts.
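The gap is easy to state in code. The first function below is the bounded-retry shape as this post describes it; the second is a hypothetical minimum for what Evolution would actually require, which no wave-1 system ships.

```python
def bounded_retry(generate, test, max_attempts: int = 3):
    # The wave-1 shape: regenerate until the tests pass or the budget
    # runs out. Every attempt starts blind; nothing carries over.
    for _ in range(max_attempts):
        code = generate()
        if test(code):
            return code
    return None

def retry_with_memory(generate, test, max_attempts: int = 3):
    # Hypothetical minimum for real Evolution: condition each attempt
    # on what already failed.
    failures = []
    for _ in range(max_attempts):
        code = generate(failures)
        if test(code):
            return code
        failures.append(code)
    return None
```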
Tran’s “dynamic architecture” category is almost empty. The wave-1 papers all fix their topology at design time. AutoGen makes topology configurable, but it’s configured by the developer, not adjusted at runtime. The only system that truly adjusts at runtime is Generative Agents, and that’s a simulation, not a production framework.
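Reduced to code, the static/dynamic distinction is about when the topology gets decided. The planner below is a hypothetical stand-in; a real dynamic orchestrator would make an LLM planning call where the keyword heuristic sits.

```python
# Static (the wave-1 default): the topology is fixed at design time.
STATIC_PIPELINE = ["analyst", "architect", "coder", "tester"]

# Dynamic: the topology is derived per task, at runtime.
def plan_topology(task: str) -> list[str]:
    # Stand-in heuristic; a real dynamic orchestrator would plan with
    # an LLM call here and could also respawn agents mid-task.
    team = ["lead"]
    if "research" in task:
        team += ["searcher", "searcher"]  # fan out for breadth
    if "code" in task:
        team += ["coder", "reviewer"]
    return team

print(plan_topology("research rival tools and code a comparison"))
# ['lead', 'searcher', 'searcher', 'coder', 'reviewer']
```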
Chen’s “evaluation-level” challenges are unsolved in a way that’s embarrassing for the field. When ChatDev claims 88 percent executability and MetaGPT claims 41 percent on a comparable benchmark, you’re not looking at a performance difference. You’re looking at two papers measuring different things with different tools and calling them the same.
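To see how two papers can measure “the same” metric differently, consider two hypothetical readings of executability. Neither function is from either paper; the point is that both are defensible and they are not comparable.

```python
import subprocess

def executability_as_runs(path: str) -> bool:
    # Reading 1: the generated program starts and exits cleanly.
    try:
        result = subprocess.run(["python", path],
                                capture_output=True, timeout=30)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0

def executability_as_compiles(path: str) -> bool:
    # Reading 2: the generated program merely parses.
    result = subprocess.run(["python", "-m", "py_compile", path],
                            capture_output=True)
    return result.returncode == 0

# Score the same batch of generated projects with both readings and
# you get two different percentages; neither says the software does
# what was asked. That is the evaluation-level gap.
```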
Next post: the wave-1 theory papers in detail. CAMEL, Generative Agents, ChatDev, MetaGPT, AutoGen. What each one actually builds, what each one trusts, and where each one breaks.