4 Architecting and building multi-agent systems
This chapter explains how and why to evolve from single agents to multi-agent systems, emphasizing both the gains in capability and the tradeoffs in cost, latency, and predictability. It centers on three foundational architectures—flow (assembly-line), orchestration (hub-and-spoke), and collaboration (peer-to-peer)—and frames design choices around decision-making (command), control (who executes tools), and communication (how context is shared). A fourth “C,” coordination, shapes how agents progress through work—sequentially, in parallel, hierarchically, or via iterative refinement. The guidance is pragmatic: begin with the simplest effective pattern (typically a flow), then iterate, mixing and adapting as needs grow.
The chapter surveys communication strategies that regulate context exposure and cost, including point-to-point message passing, shared conversational memory, and protocol-based interactions where agents can be invoked like tools. It then maps coordination variants—sequential pipelines, parallel delegation, hierarchical manager–worker plans, and iterative debate/refinement—alongside additional tactics such as voting/ensembles, role-playing, conditional routing, and decentralized peer-to-peer networks. Throughout, it highlights the balance between giving agents enough agency to be effective and constraining them to reduce ambiguity, overhead, and error, encouraging designers to align patterns across a codebase while retaining the flexibility to combine them judiciously.
On the implementation side, the text shows how to transform an overloaded single agent into a specialized agent-to-agent flow to improve focus, determinism, and cost, then introduces explicit, code-level decision points to stabilize outcomes. It demonstrates built-in handoffs that let agents transfer control without manual wiring, ways to visualize and trace flows, and techniques to monitor what is passed between agents. Robustness is increased with guardrails that validate inputs, outputs, and inter-agent transfers—sometimes powered by agents themselves—plus retry loops to recover from failures. While orchestration can centralize planning and delegation, it adds complexity and brittleness; the recommended path is to keep flows simple, strongly type inputs/outputs, limit unnecessary context, add guardrails where appropriate, and graduate to more complex coordination only after establishing reliable flows.
The three well-established patterns for building multi-agent systems: agent flow (assembly line), orchestrator (hub-and-spoke), and collaboration (peer-to-peer).
Decision-making (command) and control in more specialized multi-agent architectures: the flow, orchestration, and manager architectures.
Comparison of different agent communication patterns represented in a flow.
Various ways agents may coordinate execution sequentially or in parallel.
A single agent is transformed into a multi-agent flow. The agent's role is broken down into three distinct roles, each encapsulating a well-defined set of tools.
An agent-to-agent flow with a deterministic (coded) decision point added. Taking the decision away from the agents and making it deterministic keeps the workflow more consistent.
Three agent flow communication patterns, demonstrating how agents pass messages to one another and, in turn, pass command and control from agent to agent.
There are two ways to visualize the agent flow: draw_graph and the Traces page, which can be found on the OpenAI Dashboard under Logs.
Guardrails can be used to validate and control input and output of an agent.
Traces page after executing the Orchestration agent shows X and Y.
Summary
- Single-agent designs often hit scalability walls; converting a monolithic agent into an agentic flow (a chain of specialised agents) restores clarity, extensibility, and performance.
- Agent-to-agent flows work like prompt-chaining with superpowers: each node can invoke tools and reason independently yet pass concise, typed outputs downstream to keep the context lean.
- Insert deterministic decision points (code or schema checks) wherever the flow must repeat reliably; don’t rely on stochastic LLM judgment for pass/fail branches.
- The OpenAI Agents SDK supports two hand-off styles: conversational (shared thread) and pass-off (explicit code routing). Choose conversational for speed and pass-off for fine-grained control.
- Use the handoffs field of the agent plus clear instructional prompts to enable internal hand-offs that require zero extra orchestration code.
- Visualise complex flows early using draw_graph(); the Dashboard Traces view reveals hidden loops, tool chains, and latency hotspots.
- Wrap risky transfers in guardrails: input/output validators that reject, retry, or correct data before it corrupts the flow; tripwires surface as explicit exceptions you can loop on.
- Guardrails themselves can be LLM-powered agents, giving you natural-language policies without brittle regex or length checks—remember they, too, need schemas and tests.
- Tool limits still matter in multi-agent worlds: every registered tool inflates every call; keep each agent’s tool list tight (< 10) and scoped to its role to avoid token bloat.
- Flow, orchestration, and manager-worker are the three canonical decision patterns. Start with plain flows and graduate to orchestrators only when centralised delegation is truly required.
- Choose one communication layer (shared memory, message passing, MCP, or emerging A2A protocol) per project; mixing channels multiplies debugging pain.
- A production-ready agentic flow blends typed I/O, deterministic checkpoints, visual traces, scoped tool sets, and guardrails—yielding pipelines that can scale, recover, and evolve without surprise.
- Agents may be coordinated using multiple different strategies: Sequential Flow, Parallel Delegation, Hierarchical Coordination, Iterative Debate and Refinement, Voting / Best-of-N (Ensemble), Role-Playing Collaboration, Conditional Routing (Branching), and Peer-to-Peer Network
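Several of the points above (typed outputs, a deterministic checkpoint, a guardrail tripwire raised as an exception, and a bounded retry loop) can be sketched together in plain Python. This is a minimal illustration, not the OpenAI Agents SDK API: `writer_agent` is a hypothetical stand-in for an LLM-backed agent, and `Draft`, `GuardrailTripped`, `length_guardrail`, and `run_flow` are names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Draft:
    """Typed output handed from the writer step to the next step."""
    text: str
    word_count: int


class GuardrailTripped(Exception):
    """Raised when the output guardrail rejects a result (a tripwire)."""


def writer_agent(topic: str, attempt: int) -> Draft:
    # Hypothetical stand-in for an LLM call: the first attempt is too
    # short, later attempts are long enough to pass the guardrail.
    text = topic if attempt == 0 else (f"{topic}: " + "detail " * 5).strip()
    return Draft(text=text, word_count=len(text.split()))


def length_guardrail(draft: Draft, min_words: int = 4) -> Draft:
    # Deterministic pass/fail check written in code, not an LLM judgment.
    if draft.word_count < min_words:
        raise GuardrailTripped(f"only {draft.word_count} word(s)")
    return draft


def run_flow(topic: str, max_attempts: int = 3) -> Draft:
    # Bounded retry loop: try again only when the tripwire fires.
    for attempt in range(max_attempts):
        try:
            return length_guardrail(writer_agent(topic, attempt))
        except GuardrailTripped:
            continue
    raise RuntimeError("no acceptable draft within the attempt budget")


if __name__ == "__main__":
    print(run_flow("agents"))
```

Because the pass/fail branch lives in `length_guardrail` rather than in a prompt, the same draft always takes the same path, and the retry budget caps the worst-case cost.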
FAQ
When should I move from a single agent to a multi‑agent flow?
Move when a single agent becomes overloaded with tools or long instructions, when specialization would improve quality, or when cost/latency rises due to excess context. Splitting into focused agents clarifies roles, reduces token overhead, and makes behavior easier to reason about. Start with the simplest possible flow and iterate.
What are the main multi‑agent architectures and how do they differ?
- Flow (assembly line): Decision‑making and control pass step‑by‑step; communication is point‑to‑point. Simple and predictable.
- Orchestrator (hub‑and‑spoke): Central agent decides and delegates; workers execute. Strong oversight, higher coordination cost.
- Collaboration (peer to peer): Agents share a channel, make decisions collectively, and coordinate without a strict leader. Flexible but harder to debug.
How do the “four Cs” shape my design: decision‑making, control, communication, coordination?
- Decision‑making (command): Who plans and chooses next steps.
- Control: Who can actually use tools and perform actions.
- Communication: How context moves (shared thread, selective pass‑off, protocol like MCP).
- Coordination: Execution topology (sequential, parallel, hierarchical, iterative, etc.). Balancing these four determines predictability, cost, and speed.
Which agent communication pattern should I use?
- Shared conversation thread: All agents see the same context. Easiest, but higher token cost and potential distraction.
- Pass‑off/message passing: Call each agent with only the needed inputs. More control, more glue code.
- Protocol/tooling (e.g., MCP): Treat another agent/service as a tool. Tight scoping and clear contracts. Choose the least permissive option that still solves your use case.
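The difference in context exposure between a shared thread and a pass-off can be sketched with stub agents. This is an illustrative sketch only: `researcher` and `summarizer` are hypothetical stand-ins for LLM agents that simply report how much context they were given, and `SharedThread`, `run_shared`, and `run_pass_off` are names invented here.

```python
from dataclasses import dataclass, field


@dataclass
class SharedThread:
    """Shared conversation: every agent sees every earlier message."""
    messages: list[str] = field(default_factory=list)


def researcher(context: list[str]) -> str:
    # Hypothetical stand-in for an LLM agent.
    return f"notes based on {len(context)} message(s)"


def summarizer(context: list[str]) -> str:
    # Hypothetical stand-in for an LLM agent.
    return f"summary based on {len(context)} message(s)"


def run_shared(question: str) -> SharedThread:
    # Shared thread: each agent receives the full, growing history.
    thread = SharedThread([question])
    for agent in (researcher, summarizer):
        thread.messages.append(agent(thread.messages))
    return thread


def run_pass_off(question: str) -> str:
    # Pass-off: each agent receives only the input it needs.
    notes = researcher([question])
    return summarizer([notes])


if __name__ == "__main__":
    print(run_shared("What is a handoff?").messages[-1])
    print(run_pass_off("What is a handoff?"))
```

In the shared version the second agent's context grows with every turn, while in the pass-off version it stays at a single message; that is the token-cost trade-off described above, paid for with the extra glue code in `run_pass_off`.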
How do I refactor a single agent into a clear agent‑to‑agent flow?
Identify distinct roles and tools, create one agent per role, and pass only the minimal structured output to the next step. Constrain outputs with typed models to simplify handoffs. Keep prompts short, tool scopes small, and verify each step with simple tests before adding more agents.
What coordination strategies are available and when should I use them?
- Sequential pipeline: Ordered stages; great for linear workflows.
- Parallel delegation: Independent subtasks; faster when merging is simple.
- Hierarchical coordination: Orchestrator manages dependencies and mixtures of parallel/serial work.
- Iterative debate/refinement: Critic/gatekeeper improves quality on hard problems.
- Voting/Best‑of‑N: Ensembles for reliability; the N‑fold cost is hard to justify when tasks are cheap.
- Role‑playing: Complementary perspectives to clarify and iterate.
- Conditional routing: Classify and send to experts.
- Peer‑to‑peer: Decentralized and robust, but complex.
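Three of these strategies (parallel delegation, voting, and conditional routing) are small enough to sketch with the standard library. The sketch below assumes stub agents: `worker` is a hypothetical stand-in for an asynchronous agent call, the agent names returned by `route` are invented, and the keyword rule stands in for whatever classifier a real flow would use.

```python
import asyncio
from collections import Counter


async def worker(name: str, task: str) -> str:
    # Hypothetical stand-in for an asynchronous agent call.
    await asyncio.sleep(0)
    return f"{name} handled {task}"


async def parallel_delegation(tasks: dict[str, str]) -> list[str]:
    # Independent subtasks run concurrently; gather returns results
    # in the same order as the inputs, so merging stays simple.
    return await asyncio.gather(*(worker(n, t) for n, t in tasks.items()))


def majority_vote(answers: list[str]) -> str:
    # Voting / Best-of-N: keep the most common of N candidate answers.
    return Counter(answers).most_common(1)[0][0]


def route(request: str) -> str:
    # Conditional routing: classify in code, then send to one expert.
    return "billing_agent" if "invoice" in request.lower() else "support_agent"


if __name__ == "__main__":
    jobs = {"researcher": "search", "writer": "draft"}
    print(asyncio.run(parallel_delegation(jobs)))
    print(majority_vote(["A", "B", "A"]))
    print(route("Where is my invoice?"))
```

A sequential pipeline is the degenerate case: call the workers one after another and feed each result into the next call instead of gathering them.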
How can I make flows more deterministic and reliable?
Externalize key decisions into code (branching on typed outputs), use strict input/output schemas, and limit shared context. Add retries with bounded attempts and fallbacks. Where appropriate, insert guardrails to validate inputs/outputs and stop or correct bad states early.
What are SDK handoffs and how do they compare to manual pass‑offs?
SDK handoffs let an agent transfer control to another without glue code by declaring downstream agents and mentioning handoffs in instructions. Benefits: simpler plumbing and a single conversation thread. Trade‑offs: agents must reference each other, and complex graphs can be harder to inspect or recover if the thread breaks. Manual pass‑offs give maximum control at the cost of more code.
How do I observe, debug, and understand agent‑to‑agent behavior?
- Visualize the graph (e.g., draw a flow diagram from agent definitions).
- Inspect traces in your dashboard to see tool calls, turns, and timing.
- Wrap handoffs with callbacks to log what data moved and why the transfer occurred.
- Constrain and log typed inputs/outputs at each boundary to spot drift quickly.
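Wrapping a transfer so that it logs what moved and why can be sketched as a small higher-order function. This is a generic pattern written with the standard library, not the SDK's own callback mechanism: `Ticket`, `logged_handoff`, and `billing_agent` are hypothetical names, and the downstream agent is a stub.

```python
import json
import logging
from dataclasses import asdict, dataclass
from typing import Callable

logger = logging.getLogger("handoffs")


@dataclass(frozen=True)
class Ticket:
    """Typed payload that crosses the agent boundary."""
    customer_id: str
    issue: str


def logged_handoff(source: str, target: str, reason: str,
                   receive: Callable[[Ticket], str]) -> Callable[[Ticket], str]:
    """Wrap a downstream agent so every transfer is logged first."""
    def wrapper(payload: Ticket) -> str:
        logger.info("handoff %s -> %s (%s): %s", source, target, reason,
                    json.dumps(asdict(payload)))
        return receive(payload)
    return wrapper


def billing_agent(ticket: Ticket) -> str:
    # Hypothetical stand-in for an LLM-backed agent.
    return f"refund opened for {ticket.customer_id}"


to_billing = logged_handoff("triage", "billing", "invoice dispute", billing_agent)

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(to_billing(Ticket(customer_id="c-42", issue="double charge")))
```

Because the payload is a typed dataclass, the log line has the same fields on every transfer, which makes drift between runs easy to spot.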