Overview

1 What is an AI agent?

Recent advances in large language models and retrieval-augmented generation have set the stage for a new wave of “agents.” An LLM agent is a system that uses an LLM as its reasoning engine to autonomously choose actions and pursue goals—the model acts as the agent’s brain. Thanks to LLMs’ broad generalization and emerging reasoning capabilities, agents can analyze context, form plans, and adapt to unfamiliar tasks rather than following rigid, predefined logic.

What makes an agent distinct is its tool use and dynamic decision-making loop: the LLM evaluates context, chooses a tool, observes results, and decides whether to continue or stop. Systems can exhibit different degrees of agency, from single LLM calls and fixed chains to routing, tool selection, multi-step termination control, and even tool creation. Because agentic workflows involve multiple calls and branching paths, they trade higher capability for cost, latency, and potential error compounding. The chapter offers practical criteria for deciding when to use an LLM (unstructured data, varied inputs) and when an agent is warranted (unpredictable steps, sufficient task value, manageable and detectable error risk).
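The decision loop described here can be sketched in a few lines of Python (the book's implementation language). This is a minimal illustration, not the book's actual code: `call_llm`, the tool registry, and the decision format are hypothetical stand-ins for a real LLM client and real tools.

```python
def call_llm(context: list[dict]) -> dict:
    # Stand-in: a real implementation would call an LLM API and parse its
    # reply into {"action": <tool name>, "input": ...} or {"action": "finish", "answer": ...}.
    return {"action": "finish", "answer": "done"}

TOOLS = {
    "search": lambda query: f"results for {query!r}",  # hypothetical tool
}

def run_agent(goal: str, max_steps: int = 5) -> str:
    context = [{"role": "user", "content": goal}]
    for _ in range(max_steps):                  # cap iterations to bound cost
        decision = call_llm(context)            # 1. evaluate context, choose an action
        if decision["action"] == "finish":      # 2. decide whether to stop
            return decision["answer"]
        result = TOOLS[decision["action"]](decision["input"])  # 3. execute the tool
        context.append({"role": "tool", "content": result})    # 4. feed the result back
    return "step limit reached"
```

The step cap reflects the cost/latency trade-off mentioned above: each extra iteration is another model call.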

The book takes a from-scratch, code-first path: begin with a basic ReAct agent (reason–act–observe) that selects and executes tools, then progressively add knowledge access (RAG) for long-term memory, planning for decomposing complex goals, and reflection for self-correction. It culminates in multi-agent systems that coordinate specialized roles, along with essential practices for monitoring and evaluation to debug, analyze costs and latency, and assess real-world performance. Readers are guided through this progression with Python-based implementations, supported by a minimal toolchain (local environment, API keys) and cost-aware development tips.

[Figure: Example of a language model’s generalization capability.]
[Figure: User requests flow through the research agent, which branches into multiple searches and synthesis.]
[Figure: The LLM agent’s decision loop: an iterative process of LLM decision-making and tool use.]
[Figure: Progression of agency levels in LLM applications.]
[Figure: Progressive enhancement of agent architecture, from basic LLM–tool integration (chapters 2–4) through advanced capabilities like memory, planning, and reflection (chapters 5–8) to multi-agent coordination (chapter 9).]
[Figure: Basic agent operational cycle demonstrating autonomous reasoning and tool use.]
[Figure: Agent with cognitive capabilities.]
[Figure: Multi-agent roles for a software development task.]

Summary

  • An LLM agent is a program that determines the next action and evaluates goal completion (i.e., termination conditions) based on the output of an LLM, in accordance with a given objective and context.
  • LLMs excel at generalization: through capabilities such as few-shot learning, zero-shot learning, and reasoning with thinking tokens, they handle diverse tasks without task-specific retraining.
  • Agency levels range from simple code execution to fully autonomous systems, with increasing autonomy as the LLM gains more control over task execution, action selection, and option determination.
  • You should use an LLM agent for tasks that involve unstructured data analysis, require multiple unpredictable steps, have sufficient value to justify computational costs, and allow for error detection and correction.
  • Building agents from scratch provides a deep understanding of core principles, clear visibility into the context flow, improved debugging capabilities, and the flexibility to create custom solutions that go beyond framework limitations.

FAQ

What is an LLM agent?
An LLM agent is a system that uses a Large Language Model as its reasoning engine to autonomously decide what to do and execute actions (via tools) to achieve a goal. It doesn’t just generate text—it plans, acts, observes results, and decides when to stop.

How is an LLM agent different from a regular LLM or traditional software?
Two key differences: tool use and dynamic decision-making. Unlike a plain LLM that only outputs text, an agent can call external tools (APIs, code, search, databases). Unlike fixed-logic software, an agent chooses its next steps based on context, iterating until goals are met.

What problems are LLM agents good at, and when should I avoid them?
They shine on multi-step, context-dependent tasks requiring information gathering, synthesis, or tool use. Avoid agents for simple, deterministic tasks where traditional code or a single LLM call is cheaper, faster, and more reliable. Consider costs, latency, and risk of compounding errors.

What is the core decision loop of an LLM agent?
Iterate through: (1) evaluate context and decide whether a tool is needed, (2) execute the chosen tool, (3) add tool results back into context, and (4) decide to continue or terminate. This loop enables adaptive, goal-directed behavior.
What does “tool use” mean and why is it essential?
Tool use lets the agent invoke external functions (search, code execution, databases, APIs, file ops, email, etc.). It extends the LLM beyond text generation so it can fetch up-to-date information, perform calculations, and take real-world actions.
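One common way to expose a tool to an LLM is a plain function paired with a JSON-style schema the model can read. The sketch below is illustrative: the function name, schema shape, and dispatch helper are assumptions, not any specific provider’s API.

```python
def get_weather(city: str) -> str:
    """Hypothetical tool: a real agent would call a weather API here."""
    return f"Sunny in {city}"

# Schema describing the tool to the LLM (illustrative JSON-Schema style).
WEATHER_TOOL_SPEC = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def dispatch(call: dict, registry: dict) -> str:
    # When the model replies with a tool call such as
    # {"name": "get_weather", "arguments": {"city": "Paris"}},
    # the agent looks up the function and runs it with those arguments.
    return registry[call["name"]](**call["arguments"])
```

The agent sends the spec alongside the prompt, then executes whatever call the model emits and feeds the result back into context.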
What are the levels of agency in LLM applications?
Agency increases along a spectrum: traditional code, single LLM call, chain of LLM calls, router (LLM chooses next step), tool use (LLM chooses tools), multi-step control (LLM decides when to stop), and tool creation (LLM writes new tools/code).
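A router, one of the middle levels on this spectrum, gives the LLM control over which fixed code path runs next but nothing more. A minimal sketch, with `classify` standing in for an LLM call and the route names invented for illustration:

```python
def classify(request: str) -> str:
    # Stand-in for an LLM call that returns one of the route names below.
    return "billing" if "invoice" in request.lower() else "general"

# The handlers themselves are ordinary fixed code; only the choice is delegated.
ROUTES = {
    "billing": lambda req: "routed to billing",
    "general": lambda req: "routed to general support",
}

def route(request: str) -> str:
    return ROUTES[classify(request)](request)
```

Higher agency levels would let the model also pick tools, decide when to stop, or write new handlers.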
What is a ReAct agent and what are its components?
ReAct (Reason + Act) is the simplest agent pattern: an LLM plus a set of tools running in a loop of reason, act, and observe. The LLM selects tools based on context, executes them, integrates results, and repeats until the goal is satisfied.

How do planning, memory, and reflection make agents smarter?
Planning creates a roadmap of subtasks to reduce trial-and-error. Long-term memory stores user/task knowledge beyond the current session for personalization and reuse of successful strategies. Reflection analyzes past steps to correct mistakes, adjust strategies, and improve over time.
How is long-term memory implemented in practice?
Commonly via Retrieval-Augmented Generation (RAG): relevant items are retrieved from an external knowledge base (e.g., a vector database) and injected into the prompt so the LLM can use accurate, domain-specific, or up-to-date information.
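The RAG pattern reduces to retrieve-then-inject. In this sketch, keyword overlap stands in for the embedding similarity a real vector database would compute, and the knowledge-base entries are made up for illustration:

```python
# Hypothetical knowledge base; a real system would store embeddings in a
# vector database and search by semantic similarity.
KNOWLEDGE_BASE = [
    "The refund window is 30 days from purchase.",
    "Support is available weekdays 9am-5pm.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    # Rank documents by shared words with the query (similarity stand-in).
    words = set(query.lower().split())
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query: str) -> str:
    # Inject the retrieved facts into the prompt ahead of the question.
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

The LLM then answers from the injected context rather than from its parametric memory alone.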
What are multi-agent systems and when should I use them?
Multiple specialized agents, each with role-specific instructions and tools, collaborate on complex tasks. Use them when a single agent would face overly long prompts, too many tools, or diverse specialties. They improve scalability and specialization but can be harder to control and predict.
