Overview

1 The rise of AI agents

This chapter introduces AI agents as the next step beyond simple LLM chatbots and tool-using assistants. Whereas a basic generative assistant responds to a prompt and stops, an agent can pursue a goal over multiple steps by sensing context, planning actions, using tools, observing results, and revising its approach. The chapter frames “agentic” systems as software that can perceive, decide, and act with some autonomy, while emphasizing that autonomy should be paired with human oversight for risky or high-stakes actions.

The chapter explains the practical foundations of agent design. Agents act through tools, which may wrap APIs, databases, applications, or other external systems, and robust agents must handle tool failures, retries, fallbacks, and approval gates. The Model Context Protocol is presented as a major shift in tool integration because it standardizes how agents discover and call external tool servers, reducing the need for custom per-agent integrations. The chapter also organizes agent capabilities into five functional layers: persona, actions and tools, reasoning and planning, knowledge and memory, and evaluation and feedback.

The chapter then broadens from single agents to multi-agent systems. It explains why multiple agents are useful for specialization, parallel work, context management, and problems that naturally involve multiple independent actors. Several coordination patterns are introduced, including assembly-line flows, hub-and-spoke orchestration, and collaborative teams of agents. The chapter closes by positioning the rest of the book as a practical path from understanding these layers and patterns to building more capable agentic systems, including long-horizon agents, distributed systems, and real-world production examples.

Common patterns for communicating directly with an LLM or with an LLM that has tools. If you used earlier versions of ChatGPT, you experienced direct interaction with the LLM: no proxy agent or other assistant interjected on your behalf. Today, ChatGPT itself uses many tools to help it respond, such as web search and coding, making the current version function like an assistant.
Top: An assistant performs one or more tasks on behalf of a user, where each task requires approval by the user. Bottom: An agent may use multiple tools autonomously, without human approval, to complete a goal.
The four-step process agents use to complete goals: Sense (receive input, either a goal or feedback) -> Plan (define the task list that completes the goal) -> Act (execute the tool defined by the task) -> Learn (observe the output of the task and determine whether the goal is complete or the process needs to continue).
For an agent to use a tool, that tool must first be registered with the agent in the form of a JSON description/definition. Once the tool is registered, the agent uses that tool in a process not unlike calling a function in Python.
An agent connects to an MCP server to discover the tools it hosts and the description of how to use each tool. When an MCP server is registered with an agent, the agent internally calls list_tools to find all the tools the server supports and their descriptions. Then, as with typical tool use, it determines the best way to use each tool based on that tool's description.
The five functional layers of agents: Persona, Actions & Tools, Reasoning & Planning, Knowledge & Memory, and Evaluation & Feedback.
The Persona layer of an agent is the core layer, consisting of the system instructions that define the role of the agent and how it should complete goals and tasks. It may also describe how to reason, plan, and access knowledge and memory.
The role of Actions & Tools within the agent, and how tools can also help power the other agent layers. Tools are a core extension of agents but are also fundamental to the functions used in the upper agent layers.
The Reasoning & Planning layer of agents and how agentic thinking may be augmented. Reasoning may come from many sources: the underlying model powering the agent, prompt engineering, and even the use of tools.
The Knowledge & Memory layer and how knowledge and memory both use the same common forms of storage. Agent knowledge is information the LLM was not originally trained on but that is later added to its context. Likewise, memories represent past experiences and interactions of the user, the agent, or even other systems.
The Evaluation & Feedback layer and the mechanisms used to provide both. These range from tools that help evaluate tool use and knowledge retrieval (grounding) and provide feedback, to other agents and workflows that provide similar functionality.
The agent-flow (assembly line) pattern with multiple agents. The flow starts with a planning agent that breaks the goal down into a high-level plan, which it then passes to the research agent. The research agent executes the research tasks in the plan and, when finished, passes its work to the content agent, which is responsible for completing the later tasks of the plan, such as writing a paper based on the research.
The agent orchestration pattern, often referred to as hub-and-spoke. In this pattern, a central agent acts as the hub or orchestrator and delegates tasks to each of the worker agents. Worker agents complete their respective tasks and return the results to the hub, which determines when the goal is complete and outputs the results.
A team of collaborative agents. The agent collaboration pattern lets agents interact as peers, with back-and-forth communication from one agent to another. In some cases, a manager agent may work as a user proxy and help keep the collaborating agents on track.

Summary

  • An AI agent has agency, the ability to make decisions, undertake tasks, and act autonomously on behalf of someone or something, powered by large language models connected to tools, memory, and planning capabilities.
  • An agent's agency lets it operate through an autonomous loop called the Sense-Plan-Act-Learn process.
  • Assistants use tools to perform single tasks with user approval, while agents have the agency to reason, plan, and execute multiple tasks independently to achieve higher-level goals.
  • LLMs are used in four patterns: direct user interaction with LLMs, assistant proxy (reformulating requests), assistant (tool use with approval), and autonomous agent (independent planning and execution).
  • Agents receive goals, load instructions, reason out plans, identify required tools, execute steps in sequence, and return results, all while making autonomous decisions.
  • Agents use actions, or tool functions (extensions that wrap API calls, databases, and external resources), to act beyond their code base and interact with external systems.
  • Model Context Protocol (MCP), developed by Anthropic and released in November 2024, serves as the "USB-C for LLMs," providing a standardized protocol that allows agents to connect to MCP servers, discover available tools, and use them without custom integration code.
  • MCP addresses inconsistent tool access, unreliable data responses, fragmented integrations, code extensibility limitations, and implementation complexity, and it provides easy-to-build standardized servers.
  • AI agent development can be expressed in terms of five functional layers: Persona, Actions & Tools, Reasoning & Planning, Knowledge & Memory, and Evaluation & Feedback.
  • The Persona layer represents the core role/personality and instructions an agent will use to undertake goal and task completion.
  • The Actions & Tools layer provides the agent with the functionality to interact with and manipulate the external world.
  • The Reasoning & Planning layer enhances an agent's ability to reason and plan through complex goals that may require trial-and-error iteration.
  • The Knowledge & Memory layer represents external sources of information that can augment the agent’s context with external knowledge or recall past experiences (memories) of previous interactions.
  • The Evaluation & Feedback layer represents mechanisms external to the agent that can improve response accuracy, encourage goal and task learning, and increase confidence in overall agent output.
  • Multi-agent systems include patterns such as agent-flow assembly lines (sequential specialized workers), agent orchestration hub-and-spoke (central coordinator with specialized workers), and agent collaboration teams (agents communicating and working together with defined roles).
  • The Agent-Flow pattern is the most straightforward multi-agent implementation: specialized agents work sequentially, like an assembly line, which is ideal for well-defined multi-step tasks with designated roles.
  • The Agent Orchestration pattern is a hub-and-spoke model where a primary agent plans and coordinates with specialized worker agents, transforming single-agent tool use into multi-agent delegation.
  • The Agent Collaboration pattern takes a team-based approach. Agents communicate with each other, provide feedback and criticism, and can solve complex problems through collective intelligence, though at the cost of more computation and higher latency.
  • AI agents represent a fundamental shift from traditional programming to natural language-based interfaces, enabling complex workflow automation from prompt engineering to production-ready agent architecture.

FAQ

What is an AI agent in the context of LLM-powered systems?

An AI agent is software that perceives its environment, decides what to do, and takes action to achieve a goal using the resources provided by an LLM. Unlike a simple chatbot that responds to a prompt and stops, an agent can work through multiple steps, use tools, observe results, revise its plan, and continue until the goal is complete.

How is an agent different from an assistant?

The chapter draws the line mainly around autonomy. An assistant can use tools on behalf of a user but typically requires approval for each action and does not independently plan and complete multi-step goals. An agent, by contrast, receives a higher-level objective, reasons about what needs to happen, creates a plan, chooses tools, and executes steps with more autonomy. In production, however, high-stakes actions such as sending emails, making purchases, or deleting records should still be gated by human approval.
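
The approval gate described above can be sketched in a few lines. This is a minimal illustration, not the chapter's implementation: the `HIGH_STAKES` set, the `run_tool` helper, and the two example tools are all hypothetical names introduced here, and the approver is a stand-in for a real human prompt.

```python
from typing import Callable

# Hypothetical set of tool names treated as high-stakes (an assumption).
HIGH_STAKES = {"send_email", "make_purchase", "delete_record"}

def run_tool(name: str, tool: Callable[..., str], args: dict,
             approve: Callable[[str, dict], bool]) -> str:
    """Run a tool, consulting the approver first when the tool is high-stakes."""
    if name in HIGH_STAKES and not approve(name, args):
        return f"{name}: blocked, approval denied"
    return tool(**args)

def send_email(to: str, body: str) -> str:
    # Stand-in for a real email integration.
    return f"email sent to {to}"

def search(query: str) -> str:
    # Stand-in for a real search tool.
    return f"results for {query}"

def deny_all(name: str, args: dict) -> bool:
    # A stand-in approver that refuses everything; a real one would ask a person.
    return False

low_risk = run_tool("search", search, {"query": "MCP"}, deny_all)
gated = run_tool("send_email", send_email,
                 {"to": "a@example.com", "body": "hi"}, deny_all)
print(low_risk)  # the search runs without consulting the approver
print(gated)     # the email is blocked because approval was denied
```

Keeping the gate in the tool-execution path, rather than in the prompt, means the check still applies even if the model ignores its instructions.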

What does “agentic” mean?

“Agentic” describes systems or behaviors that exhibit agency: the ability to perceive, decide, and act with some degree of autonomy in pursuit of a goal. In this engineering context, an agentic system is one that follows these operational patterns rather than merely producing text that resembles intentional behavior.

What are the four LLM interaction patterns described in the chapter?

The chapter describes four patterns: direct LLM chat, tool-augmented LLMs, assistants, and agents. Direct chat only generates text. Tool-augmented LLMs can invoke a single tool such as web search or image generation. Assistants can perform tasks with per-step user approval. Agents operate at the goal level, deciding the steps needed to complete multi-step work while using appropriate oversight for risky actions.

What is the Sense, Plan, Act, Learn cycle?

The Sense, Plan, Act, Learn cycle is the basic loop agents use to complete goals. In the Sense step, the agent receives input, context, or feedback. In the Plan step, it reasons about the goal and defines tasks. In the Act step, it executes tools or actions. In the Learn step, it observes the results, evaluates whether the goal is complete, updates memory or context if needed, and decides whether to continue.
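
One way to picture the cycle is as a bounded loop. The sketch below is an assumption-laden toy: `run_agent`, `scripted_planner`, and the two lambda tools are hypothetical, and the planner is a hard-coded function standing in for the reasoning an LLM would do.

```python
from typing import Callable, Optional

def run_agent(goal: str,
              plan_next: Callable[[dict], Optional[tuple]],
              tools: dict,
              max_steps: int = 10) -> list:
    """Toy Sense-Plan-Act-Learn loop with a step limit."""
    memory = []
    for _ in range(max_steps):
        context = {"goal": goal, "memory": list(memory)}  # Sense: gather input
        task = plan_next(context)                         # Plan: pick the next task
        if task is None:                                  # planner judges the goal complete
            break
        tool_name, argument = task
        result = tools[tool_name](argument)               # Act: execute the tool
        memory.append((tool_name, result))                # Learn: record the observation
    return memory

def scripted_planner(context: dict) -> Optional[tuple]:
    # Hard-coded stand-in for LLM planning: search first, then summarize, then stop.
    steps = len(context["memory"])
    if steps == 0:
        return ("search", context["goal"])
    if steps == 1:
        return ("summarize", context["memory"][-1][1])
    return None

tools = {
    "search": lambda query: f"notes on {query}",
    "summarize": lambda text: text.upper(),
}

history = run_agent("agents", scripted_planner, tools)
print(history)
```

The `max_steps` limit is a design choice worth keeping in any real loop, since a planner that never reports completion would otherwise run indefinitely.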

Why are tools important for AI agents?

Tools allow agents to act outside the LLM itself. They may wrap API calls, databases, external applications, search systems, files, or other resources. Agents use tools to complete tasks, retrieve context, update memory, evaluate outputs, and interact with external systems. Tool use is what lets an agent book a flight, query a database, send a message, create an image, or update a project tracker rather than merely describe how to do those things.
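
Tool registration and dispatch might look roughly like the following. This is a sketch under assumptions: the field names in `WEATHER_TOOL` loosely follow common function-calling formats rather than any specific vendor's schema, and `get_weather`, `register`, and `dispatch` are hypothetical names.

```python
import json

def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would call a weather API."""
    return f"Sunny in {city}"

# JSON-style description the model would see. Field names are illustrative.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

REGISTRY = {}

def register(definition: dict, func) -> None:
    """Store the tool's definition and implementation under its name."""
    REGISTRY[definition["name"]] = (definition, func)

def dispatch(tool_call_json: str) -> str:
    """Execute a tool call that the model emitted as JSON."""
    call = json.loads(tool_call_json)
    definition, func = REGISTRY[call["name"]]
    args = call.get("arguments", {})
    missing = [p for p in definition["parameters"]["required"] if p not in args]
    if missing:
        # Return the problem as text so the model can observe it and retry.
        return f"error: missing arguments {missing}"
    return func(**args)

register(WEATHER_TOOL, get_weather)
ok = dispatch('{"name": "get_weather", "arguments": {"city": "Oslo"}}')
bad = dispatch('{"name": "get_weather", "arguments": {}}')
print(ok)
print(bad)
```

Returning errors as ordinary tool output, instead of raising, gives the agent something to observe in its next step.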

What is the Model Context Protocol (MCP)?

The Model Context Protocol, or MCP, is an open standard developed at Anthropic and released in November 2024. It is based on JSON-RPC 2.0 and provides a consistent way for AI systems to connect to external tools and services. MCP lets agents discover available tools from an MCP server, understand how to call them, and use them without every developer having to build custom tool wrappers for each agent.
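
To make the JSON-RPC 2.0 framing concrete, here is a toy, in-process exchange. It is not the official SDK and omits the initialization handshake and transport entirely. The `tools/list` and `tools/call` method names and the `inputSchema` and `content` fields reflect the MCP specification as best understood here and should be checked against it. The `add` tool and the `handle` and `rpc` helpers are hypothetical.

```python
import json

# Tools this toy server exposes; `add` is a hypothetical example tool.
TOOLS = [{
    "name": "add",
    "description": "Add two integers.",
    "inputSchema": {
        "type": "object",
        "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
        "required": ["a", "b"],
    },
}]

def error_response(req_id, code: int, message: str) -> str:
    return json.dumps({"jsonrpc": "2.0", "id": req_id,
                       "error": {"code": code, "message": message}})

def handle(request_json: str) -> str:
    """Answer a JSON-RPC 2.0 request the way a minimal tool server might."""
    req = json.loads(request_json)
    if req["method"] == "tools/list":
        result = {"tools": TOOLS}
    elif req["method"] == "tools/call":
        params = req["params"]
        if params["name"] != "add":
            return error_response(req["id"], -32602, "Invalid params")
        args = params["arguments"]
        result = {"content": [{"type": "text", "text": str(args["a"] + args["b"])}]}
    else:
        return error_response(req["id"], -32601, "Method not found")
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})

def rpc(req_id: int, method: str, params=None) -> dict:
    """Client side: build a request, send it to the toy server, parse the reply."""
    msg = {"jsonrpc": "2.0", "id": req_id, "method": method}
    if params is not None:
        msg["params"] = params
    return json.loads(handle(json.dumps(msg)))

listed = rpc(1, "tools/list")   # discovery: which tools exist and how to call them
called = rpc(2, "tools/call", {"name": "add", "arguments": {"a": 2, "b": 3}})
print(listed["result"]["tools"][0]["name"])
print(called["result"]["content"][0]["text"])
```

Because discovery returns each tool's description and input schema, the client needs no tool-specific code, which is the point of the standard.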

Why is MCP sometimes called the “USB-C for LLMs and agents”?

MCP is compared to USB-C because it standardizes connection between agents and external capabilities. Instead of each agent needing bespoke integrations for calendars, Slack, databases, Jira, search, or other services, an agent can connect to MCP servers that expose tools in a predictable format. This reduces fragmented integrations, inconsistent tool access, unreliable response formats, and duplicated implementation work.

What are the five functional layers of an agent?

The five functional layers are Persona, Actions & Tools, Reasoning & Planning, Knowledge & Memory, and Evaluation & Feedback. Persona defines the agent’s role, instructions, style, and constraints. Actions & Tools let the agent affect the world or retrieve context. Reasoning & Planning controls how the agent thinks through tasks. Knowledge & Memory provide external information and remembered experience. Evaluation & Feedback help assess, critique, and improve the agent’s outputs and actions.

Why use multi-agent systems instead of a single agent?

Multi-agent systems are useful when a single agent becomes limited by scale, scope, tool overload, or context size. Multiple agents can specialize in different roles, work in parallel, divide context across several smaller windows, and model problems that naturally involve multiple actors. The chapter introduces several multi-agent patterns, including agent-flow assembly lines, hub-and-spoke orchestration, and collaborative teams of agents.
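
The first two patterns can be contrasted in a short sketch. Everything here is a simplification: each "agent" is a plain function from text to text, the names `planner`, `researcher`, `writer`, `assembly_line`, and `hub_and_spoke` are hypothetical, and the hub only fans the goal out and collects results rather than deciding when the goal is complete.

```python
def planner(goal: str) -> str:
    # Stand-in for an LLM-backed planning agent.
    return f"plan({goal})"

def researcher(plan: str) -> str:
    # Stand-in for a research agent.
    return f"research({plan})"

def writer(research: str) -> str:
    # Stand-in for a content agent.
    return f"paper({research})"

def assembly_line(goal: str, stages: list) -> str:
    """Agent-flow: each agent's output becomes the next agent's input."""
    work = goal
    for stage in stages:
        work = stage(work)
    return work

def hub_and_spoke(goal: str, workers: dict) -> dict:
    """Orchestration: the hub hands the goal to each worker and gathers the results."""
    return {name: worker(goal) for name, worker in workers.items()}

flow_result = assembly_line("agents", [planner, researcher, writer])
hub_result = hub_and_spoke("agents", {"research": researcher, "writing": writer})
print(flow_result)
print(hub_result)
```

In the assembly line, work flows in one direction and each stage depends on the last. In hub-and-spoke, workers are independent of one another, which is what makes parallel execution possible.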
