Overview

1 What are LLM Agents and Multi-Agent Systems?

Large Language Models can express intentions, plans, and tool-use requests in text, but they cannot directly carry out actions on their own. LLM agents address this gap by wrapping a backbone LLM in an orchestration system that executes tool calls, manages intermediate steps, and returns results for the model to synthesize. Multi-agent systems extend this idea by coordinating multiple specialized LLM agents so they can collaborate on larger tasks that benefit from decomposition, specialization, and combined outputs.

The chapter explains that effective LLM agents depend heavily on planning and tool calling. A typical agent operates through a processing loop: it receives a task, forms or updates a plan, selects and invokes tools, observes results, and repeats until the task is complete or a stopping condition is reached. This loop can produce useful trajectories for debugging and improvement. The chapter also surveys common applications such as report generation, web and deep search, agentic retrieval-augmented generation, coding agents, and computer-use agents, while noting that these systems require monitoring because errors, hallucinations, and poor plans can cascade through later steps.
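
The processing loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the book's implementation: `Step`, `run_agent`, and `scripted_decide` are hypothetical names, and the backbone LLM is replaced by a scripted function so the example runs without a model.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Step:
    """One model decision: either call a tool or finish with an answer."""
    tool_name: Optional[str] = None
    arguments: dict = field(default_factory=dict)
    final_answer: Optional[str] = None


def run_agent(task: str, decide: Callable, tools: dict, max_steps: int = 5):
    """Repeat: plan, call a tool, observe the result, until done or out of steps."""
    observations = []
    for _ in range(max_steps):
        step = decide(task, observations)              # form or update the plan
        if step.final_answer is not None:              # task complete
            return step.final_answer, observations
        result = tools[step.tool_name](**step.arguments)      # invoke the tool
        observations.append(f"{step.tool_name} -> {result}")  # observe the result
    return "stopped: step limit reached", observations  # stopping condition


def scripted_decide(task: str, observations: list) -> Step:
    """Hypothetical stand-in for a backbone LLM (assumption, not a real model)."""
    if not observations:
        return Step(tool_name="add", arguments={"a": 2, "b": 3})
    value = observations[-1].split(" -> ")[1]
    return Step(final_answer=f"The result is {value}")


def add(a: int, b: int) -> int:
    return a + b


answer, trajectory = run_agent("What is 2 + 3?", scripted_decide, {"add": add})
```

Here `trajectory` holds the list of observations gathered along the way, which is the kind of record the chapter describes as useful for debugging. The `max_steps` limit is one simple form of stopping condition.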

The chapter also introduces important enhancements and ecosystem standards. Memory allows agents to store and reuse information from past executions, while human-in-the-loop designs let people review plans, approve actions, validate outputs, or redirect failed attempts at the cost of slower execution. Protocols such as MCP, Agent Skills, and A2A help standardize tool access, reusable workflows, and agent-to-agent communication. Finally, the book’s roadmap is framed around building an educational LLM agent framework from scratch: first implementing tools, LLM interfaces, and a core agent loop; then adding MCP, skills, memory, and human oversight; and finally assembling multiple agents into a coordinated multi-agent system.
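
As a rough illustration of the two enhancements mentioned above, the sketch below shows a tiny memory module and a human approval gate. All names here (`SimpleMemory`, `run_with_approval`) are hypothetical and chosen for this example only. The human operator is modeled as a callable so the sketch runs without interactive input.

```python
from typing import Callable


class SimpleMemory:
    """Hypothetical key-value memory that persists across task executions."""

    def __init__(self) -> None:
        self._facts: dict = {}

    def save(self, key: str, value: str) -> None:
        self._facts[key] = value

    def load_context(self) -> str:
        """Render stored facts as text that could be added to a later prompt."""
        return "\n".join(f"{k}: {v}" for k, v in sorted(self._facts.items()))


def run_with_approval(
    action: str,
    execute: Callable[[], str],
    approve: Callable[[str], bool],
) -> str:
    """Pause for a human decision before executing an action."""
    if not approve(action):
        return f"rejected: {action}"
    return execute()


memory = SimpleMemory()
memory.save("preferred_city", "New York City")

# The approver would normally be a person; here it is a simple rule.
outcome = run_with_approval(
    "send report",
    execute=lambda: "report sent",
    approve=lambda action: action != "delete files",
)
```

In a real system the `approve` call would block until a person responds, which is where the slower execution noted above comes from.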

The applications for LLM agents are many, including agentic RAG, report generation, deep search, and computer use, all of which can benefit from multi-agent systems (MAS).
An LLM agent consists of a backbone LLM and its equipped tools.
LLM agents utilize the planning capability of backbone LLMs to formulate initial plans for tasks, as well as to adapt current plans based on the results of past steps or actions taken towards task completion.
An illustration of the tool-equipping process, where a textual description of the tool that contains the tool’s name, description and its parameters is provided to the LLM agent.
The tool-calling process, where any equipped tool can be used.
A mental model of an LLM agent performing a task through its processing loop, where tool calling and planning are used repeatedly. The task is executed through a series of sub-steps, a typical approach for performing tasks.
An LLM agent that has access to memory modules where it can store key information of task executions and load this back into its context for future tasks.
A mental model of the LLM agent processing loop that has memory modules for saving and loading important information obtained during task execution.
An LLM agent processing loop with access to human operators. The processing loop is effectively paused each time a human operator is required to provide input.
Multiple LLM agents collaborating to complete an overarching task. The outcomes of each LLM agent’s processing loop are combined to form the overall task result.
The difference between LLM agent framework developers and application developers
A first look at the llm-agents-from-scratch framework that we’ll build together.
A simple UML class diagram that shows two classes from the llm-agents-from-scratch framework. The BaseTool class lives in the base module, while the ToolCallResult lives in the data_structures module. The attributes and methods of both classes are indicated in their respective class diagrams and the relation between them is also described.
A UML sequence diagram that illustrates the flow of a tool call. First, an LLM agent prepares a ToolCall object and invokes the BaseTool, which initiates the processing of the tool call. Once completed, the BaseTool class constructs a ToolCallResult, which is then sent back to the LLM agent.
The build plan for our llm-agents-from-scratch framework
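
The class and sequence diagrams captioned above name three types: `BaseTool`, `ToolCall`, and `ToolCallResult`. The sketch below shows one way they could fit together. The attributes and method names (`tool_name`, `arguments`, `content`, `error`, `run`) are assumptions made for illustration, since the captions do not list them, and the framework's actual definitions may differ.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


@dataclass
class ToolCall:
    """A request prepared by the agent (fields are assumed, not from the book)."""
    tool_name: str
    arguments: dict


@dataclass
class ToolCallResult:
    """What the tool sends back to the agent (fields are assumed)."""
    tool_name: str
    content: Any
    error: bool = False


class BaseTool(ABC):
    """Common interface for tools; subclasses implement `run`."""
    name: str
    description: str

    @abstractmethod
    def run(self, **kwargs: Any) -> Any:
        ...

    def __call__(self, tool_call: ToolCall) -> ToolCallResult:
        # Process the call and wrap the outcome, including failures.
        try:
            return ToolCallResult(self.name, self.run(**tool_call.arguments))
        except Exception as exc:
            return ToolCallResult(self.name, str(exc), error=True)


class MultiplyTool(BaseTool):
    name = "multiply"
    description = "Multiply two numbers."

    def run(self, a: float, b: float) -> float:
        return a * b


result = MultiplyTool()(ToolCall("multiply", {"a": 6, "b": 7}))
```

Returning errors inside a `ToolCallResult` instead of raising them is a design choice in this sketch: it lets the agent pass the failure back to the LLM so it can adjust its plan.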

Summary

  • LLMs have become very powerful text generators that have been applied successfully to tasks like text summarization, question-answering, and text classification, but they have a critical limitation: they cannot act. They can only express an intent to act (such as making a tool call) through text. LLM agents close this gap by carrying out the intended actions.
  • Applications for LLM agents are many, such as report generation, deep research, computer use and coding.
  • With multi-agent systems (MAS), individual LLM agents collaborate to collectively perform tasks.
  • Many applications for LLM agents can further benefit from MAS. In principle, MAS excel when complex tasks can be decomposed into smaller subtasks, where specialized LLM agents outperform general-purpose LLM agents.
  • LLM agents are systems composed of an LLM and tools that can act autonomously to perform tasks.
  • LLM agents use a processing loop to execute tasks. Tool calling and planning capabilities are key components of that processing loop.
  • Protocols like MCP, Agent Skills, and A2A have helped to create a vibrant LLM agent ecosystem that is powering the growth of LLM agents and their applications.
  • MCP is a protocol developed by Anthropic that has paved the way for LLM agents to use third-party provided tools.
  • Agent Skills is another protocol that originated from Anthropic. It standardizes how reusable workflows are documented and used by LLM agents.
  • A2A is a protocol developed by Google to standardize agent-to-agent interactions in MAS.
  • In this book, we’ll primarily be framework developers, building our own LLM agent framework to learn the internals of LLM agents more deeply.
  • Supplementary materials in the form of additional example notebooks and capstone projects are provided in the book’s GitHub repository to deepen your learning.

FAQ

What is an LLM agent? An LLM agent is an autonomous system made up of a backbone LLM and tools. It acts on tool-call requests and plans generated by the LLM to perform tasks on behalf of a user. The LLM provides the intent and reasoning, while the surrounding agent system orchestrates and executes actions.
Why are LLMs alone insufficient for acting on user requests? LLMs are text generators: they can describe what they intend to do, formulate plans, and generate tool-call requests, but they cannot execute actions by themselves. For example, an LLM can say it will search the web for croissant prices in New York City, but an LLM agent is needed to actually process the web-search request, return results, and continue the task.
What is tool calling in an LLM agent? Tool calling is the process where an LLM is given textual descriptions of available tools, including their names, purposes, and parameters. The LLM then generates a structured request, often in JSON, identifying which tool to use and what parameter values to pass. The agent system executes the tool call and sends the result back to the LLM for synthesis.
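
To make the tool-calling answer concrete, here is a small sketch of a tool description and of the agent-side handling of a JSON request. The description layout, the `web_search` stub, and the `execute_tool_call` helper are assumptions for illustration; real LLM providers and frameworks each define their own schema.

```python
import json

# Hypothetical tool description: name, purpose, and parameters as text/data.
WEB_SEARCH_DESCRIPTION = {
    "name": "web_search",
    "description": "Search the web and return a short text result.",
    "parameters": {
        "query": {"type": "string", "description": "The search query."},
    },
}


def web_search(query: str) -> str:
    """Stand-in tool: a real one would call a search service."""
    return f"(stub) results for: {query}"


# Registry mapping tool names to their description and implementation.
TOOLS = {"web_search": (WEB_SEARCH_DESCRIPTION, web_search)}


def execute_tool_call(raw_request: str) -> str:
    """Parse a JSON tool-call request, validate it, and run the tool."""
    request = json.loads(raw_request)
    name = request["name"]
    if name not in TOOLS:
        return f"error: unknown tool '{name}'"
    description, function = TOOLS[name]
    arguments = request.get("arguments", {})
    unexpected = set(arguments) - set(description["parameters"])
    if unexpected:
        return f"error: unexpected parameters {sorted(unexpected)}"
    return function(**arguments)


# Text the LLM might generate after seeing the description above.
llm_output = '{"name": "web_search", "arguments": {"query": "croissant prices in New York City"}}'
result = execute_tool_call(llm_output)
```

The string in `result` is what the agent would send back to the LLM for synthesis. Error strings are returned the same way so the model can see what went wrong and try again.
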
What capabilities does a backbone LLM need to be useful in an agent? A backbone LLM should be good at planning and tool calling. Planning allows it to create initial task strategies and adapt them as results come in. Tool calling allows it to select appropriate tools, fill in parameters, and request external actions such as web searches, calculations, code execution, or data retrieval.
What is the processing loop of an LLM agent? The processing loop is the core execution cycle where the agent repeatedly plans, calls tools, observes results, and decides what to do next. A task is usually broken into sub-steps. At each step, the agent synthesizes progress so far, creates the next plan, may execute tool calls, and continues until the task is complete or a stopping condition is reached.
What is an LLM agent trajectory or rollout? An LLM agent trajectory, also called a rollout, is a record of the steps taken during task execution. It can include the agent’s plans, tool-call requests, tool results, and intermediate conclusions. Trajectories are usually not shown to the end user, but they are valuable for debugging, monitoring, and improving agent behavior.
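
One simple way to picture a trajectory is as an ordered list of typed events that can be serialized for later inspection. The sketch below assumes hypothetical `Trajectory` and `TrajectoryEvent` classes and four event kinds taken from the answer above; it is not a format defined by the book.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TrajectoryEvent:
    kind: str      # assumed kinds: "plan", "tool_call", "tool_result", "conclusion"
    content: str


@dataclass
class Trajectory:
    task: str
    events: list = field(default_factory=list)

    def record(self, kind: str, content: str) -> None:
        """Append one step of the execution to the record."""
        self.events.append(TrajectoryEvent(kind, content))

    def count(self, kind: str) -> int:
        """Count events of one kind, e.g. to monitor tool usage."""
        return sum(1 for event in self.events if event.kind == kind)

    def to_json(self) -> str:
        """Serialize the whole rollout for logging or offline debugging."""
        return json.dumps(asdict(self), indent=2)


trajectory = Trajectory(task="Find croissant prices in New York City")
trajectory.record("plan", "Search the web for current croissant prices.")
trajectory.record("tool_call", '{"name": "web_search", "arguments": {"query": "croissant prices NYC"}}')
trajectory.record("tool_result", "(stub) results")
trajectory.record("conclusion", "Enough information gathered to answer.")
```

Calling `trajectory.to_json()` produces a log entry that can be stored and reviewed without showing it to the end user.
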
What are common real-world applications of LLM agents? Common applications include report generation, web search, deep research, agentic retrieval-augmented generation (RAG), coding agents, and computer-use agents. These systems can collect information, synthesize findings, query knowledge stores, write or execute code, browse the web, and interact with software applications.
What is a multi-agent system? A multi-agent system, or MAS, combines multiple LLM agents into one larger system. Each agent may specialize in a different part of the task, and their results are coordinated to solve the broader problem. For example, one agent might find the best-value croissants in New York City while another finds the best-value pizza, and their results are combined into a food-value report.
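
The croissant-and-pizza example can be sketched as a coordinator that hands a subtask to each specialized agent and combines their outputs. Everything here is hypothetical: the agents are stub functions standing in for full LLM agents, and `run_multi_agent` is an illustrative name, not part of any protocol or framework.

```python
from typing import Callable

# In this sketch an "agent" is any callable that maps a task to a text result.
Agent = Callable[[str], str]


def croissant_agent(task: str) -> str:
    """Stub specialist; a real agent would run its own processing loop."""
    return "Best-value croissant: (stub result)"


def pizza_agent(task: str) -> str:
    """Stub specialist; a real agent would run its own processing loop."""
    return "Best-value pizza: (stub result)"


def run_multi_agent(task: str, agents: dict) -> str:
    """Give each specialist its part of the task and combine the results."""
    sections = []
    for specialty, agent in agents.items():
        subtask = f"{task} (focus: {specialty})"
        sections.append(f"[{specialty}]\n{agent(subtask)}")
    return "\n\n".join(sections)


report = run_multi_agent(
    "Find the best-value food in New York City",
    {"croissants": croissant_agent, "pizza": pizza_agent},
)
```

This coordinator runs the agents one after another and simply concatenates their results. Running agents in parallel or having another LLM synthesize the sections are natural variations.
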
When can multi-agent systems be beneficial? Multi-agent systems are useful when a complex task can be decomposed into smaller subtasks that specialized agents can handle better than one general-purpose agent. They can improve performance in areas such as report generation, coding projects, deep research, and workflows that benefit from distinct roles or expertise. However, MAS can also introduce new coordination challenges and failure modes.
What protocols and standards are important for LLM agents? Important protocols and standards include Anthropic’s Model Context Protocol (MCP), which standardizes access to third-party tools and resources; Agent Skills, which defines reusable workflows that agents can discover and execute; and Google’s Agent2Agent Protocol (A2A), which standardizes communication between agents, including agents built with different frameworks.
