Overview

5 Agent Reasoning and Planning

This chapter explains how reasoning and planning transform LLM-powered agents from mere token predictors into purposeful problem solvers. Reasoning is framed as decomposing a goal into smaller tasks, while planning is the synthesis of these tasks into an executable strategy. The text contrasts non-reasoning models that need explicit prompting with newer models that natively exhibit structured thinking, then introduces a toolkit of patterns for eliciting, organizing, and controlling an agent’s thought process. It culminates by positioning the Sequential Thinking MCP server as a shared, model-agnostic scratchpad that helps agents manage multi-step reasoning and evolving plans.

The core techniques begin with Chain-of-Thought (CoT), which guides step-by-step reasoning and makes intermediate thoughts explicit, and ReAct, which interleaves thoughts with tool calls and observations in a feedback loop. Beyond single exchanges, planning equips agents with a global view over longer horizons so they can organize sub-tasks, act, observe, and revise. The chapter shows how concise instructions and tool definitions imbue agents with these behaviors, illustrates them on time-travel puzzles, and weighs trade-offs: transparency and reproducibility versus higher token usage and latency, plus the need to tailor prompts to each model’s style.

For harder problems, the text introduces Tree-of-Thoughts to explore multiple reasoning branches with evaluation and pruning, and Reflexion to iteratively improve solutions through a solver–critic feedback loop. These advanced strategies can be combined with CoT/ReAct and supported by the Sequential Thinking server to record, revise, verify, and coordinate plans. Practical guidance is offered for selecting patterns based on task complexity, cost, and responsiveness, with the caveat that outcomes vary by model. The chapter closes by underscoring reasoning and planning as core to capable agents and providing exercises that help readers practice, compare, and operationalize these approaches.

The chapter’s figures illustrate these patterns:

  • A comparison of prompting strategies for reasoning and non-reasoning models.
  • The LLM thought process when using CoT prompting (left) versus using no explicit reasoning instructions.
  • A sequential diagram of the ReAct reasoning paradigm: in a loop, the LLM first reasons and plans, then executes the plan; after each task (tool) execution, it observes the output and decides whether to continue looping or conclude that the plan is complete.
  • A partial workflow of the Tree-of-Thoughts process, in which thoughts branch into nodes and each node is evaluated to determine the best path to follow.
  • A step-by-step flowchart of the Reflexion reasoning strategy, in which each step is a task, decision, or output produced by the LLM or by deterministic code.
  • The Traces page for the Time Travel Agent execution, showing the agent moving through the ReAct pattern of reasoning, acting, observing, and reasoning again.

Summary

  • Large Language Models don’t “think” by default—they’re token-predictors—so agents must inject structured reasoning and planning to achieve multi-step goals.
  • Chain-of-Thought (CoT) prompting turns a model’s hidden intuition into explicit, step-by-step thoughts that are easy to debug, at the cost of extra tokens and latency.
  • ReAct augments CoT with tool calls: Reason → Act → Observe → Repeat, letting an agent gather information dynamically while iteratively refining its plan.
  • High-level planning goes beyond single chains: agents can ask an LLM to draft a strategic outline, then revise it as real-world feedback arrives.
  • Tree-of-Thoughts (ToT) explores many branches in parallel, pruning losers and expanding promising paths—powerful for complex search tasks but extremely token-hungry.
  • Reflexion wraps a solver–critic loop around any reasoning strategy: the critic provides feedback, the solver revises, and the cycle repeats until the answer passes a self-defined check.
  • Choosing a strategy is task-dependent: CoT for logic puzzles, ReAct for tool-heavy look-ups, ToT for deep planning, Reflexion for iterative improvement—mix and match as needs evolve.
  • The Sequential Thinking MCP server acts as a universal “scratchpad” tool; agents write, revise, and branch thoughts there while still using ReAct or ToT patterns externally.
  • Combining strategies scales up reasoning: e.g. CoT to draft a plan, ReAct+ToT to execute/branch it, Reflexion to self-grade and retry—expect high latency but high reliability.
  • Guardrails, schemas, and typed outputs remain essential; reasoning output should be validated just like any other agent I/O to avoid cascading errors.
  • Model choice matters: newer reasoning-native models (e.g. GPT-4o family) handle these patterns with fewer prompts, but even base models can reason when coached properly.
  • Keep tool lists lean; every additional tool inflates ReAct loops and Sequential Thinking calls—scope each agent to < 10 highly relevant tools.
  • A production-ready reasoning agent pairs structured prompts, the right reasoning strategy, Sequential Thinking for thought tracking, and guardrails for validation—yielding plans that can adjust, retry, and succeed autonomously.
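The guardrail point above can be made concrete: reasoning output that feeds later steps should be parsed into a typed structure and rejected when it doesn’t fit, so one malformed step can’t cascade. A minimal sketch, assuming the agent emits JSON steps with `thought`, `action`, and `confidence` fields (the field names are illustrative, not from the chapter):

```python
import json
from dataclasses import dataclass

@dataclass
class ReasoningStep:
    """Typed container for one step of an agent's reasoning output."""
    thought: str
    action: str
    confidence: float

def parse_step(raw: str) -> ReasoningStep:
    """Validate a JSON reasoning step before it reaches downstream tools."""
    data = json.loads(raw)        # raises ValueError on malformed JSON
    step = ReasoningStep(**data)  # raises TypeError on missing/extra keys
    if not 0.0 <= step.confidence <= 1.0:
        raise ValueError(f"confidence out of range: {step.confidence}")
    return step

step = parse_step(
    '{"thought": "check the timeline", "action": "lookup_event", "confidence": 0.9}'
)
```

Rejecting bad steps at this boundary is the same discipline as validating any other agent I/O.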

FAQ

What do “reasoning” and “planning” mean for LLM-powered agents?
Reasoning is how an LLM breaks a problem into smaller tasks; planning is how it organizes those tasks into an executable sequence to reach a goal. Together they enable agents to make decisions, take actions, and adjust as they progress.

Do I need a special “reasoning model,” or can prompting make any LLM reason?
Both work. Modern models often reason by default, but you can elicit reliable reasoning from any LLM with prompt patterns such as Chain-of-Thought (CoT) and ReAct. Even with reasoning models, prompts help you control style, structure, and reproducibility.
What is Chain-of-Thought (CoT) prompting and when should I use it?
CoT asks the model to think step by step before answering. Use it for logic, math, or procedural problems where explicit intermediate steps improve accuracy and debuggability. Trade-offs: more tokens, higher latency, and an occasional need to tune the step template to the model.
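A CoT instruction can be as small as a fixed template wrapped around the question. A minimal sketch (the step wording is illustrative and, as noted above, usually needs tuning to each model’s style):

```python
def cot_prompt(question: str) -> str:
    """Wrap a question in a simple Chain-of-Thought template."""
    return (
        "Answer the question below. Think step by step:\n"
        "1. Restate the problem in your own words.\n"
        "2. List the facts you know.\n"
        "3. Work through the logic one step at a time.\n"
        "4. State the final answer on a line starting with 'Answer:'.\n\n"
        f"Question: {question}"
    )

prompt = cot_prompt("If I leave 2030 and travel back 47 years, what year do I arrive in?")
```

The template is sent as the user (or system) message; the model’s intermediate steps then appear in its reply, where they can be inspected and debugged.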
How does ReAct differ from CoT?
ReAct interleaves reasoning with actions and observations: think → act (call a tool) → observe → adjust the plan. It’s ideal when external tools, APIs, or retrieval are needed. Compared with CoT, ReAct is more interactive and adaptive but still incurs extra tokens and complexity.
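Stripped of the LLM, the think → act → observe cycle reduces to a short control loop. A minimal sketch with a stubbed “LLM” policy and a single fake tool (all names and the loop shape are illustrative, not the chapter’s implementation):

```python
def react_loop(goal, llm, tools, max_turns=5):
    """ReAct skeleton: Reason -> Act -> Observe, repeated until done."""
    observations = []
    for _ in range(max_turns):
        decision = llm(goal, observations)            # Reason: pick next action
        if decision["action"] == "finish":
            return decision["answer"]
        tool = tools[decision["action"]]              # Act: call the chosen tool
        observations.append(tool(decision["input"]))  # Observe: record the result
    return None  # gave up after max_turns

# Stub "LLM" policy: look the year up first, then finish with the observation.
def fake_llm(goal, observations):
    if not observations:
        return {"action": "lookup_year", "input": goal}
    return {"action": "finish", "answer": observations[-1]}

tools = {"lookup_year": lambda query: "1985"}
answer = react_loop("What year does the DeLorean land in?", fake_llm, tools)
```

In a real agent, `fake_llm` is an LLM call whose prompt lists the available tools and the observations so far; the loop structure stays the same.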
What does “planning with LLMs” add beyond CoT/ReAct?
Planning introduces a global, longer-horizon view across multiple reasoning loops, organizing sub-goals and revisiting them as information changes. You can prompt the model to propose a plan first, execute it stepwise (often via ReAct), and then revise it based on observations.
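That propose–execute–revise shape can be sketched as a loop over a mutable plan. A minimal sketch with stubbed callables (in practice `propose_plan` and `revise_plan` are LLM calls and `execute_step` is often a ReAct loop; the names are illustrative):

```python
def plan_and_execute(goal, propose_plan, execute_step, revise_plan):
    """Draft a plan, execute it stepwise, and revise it when a step fails."""
    plan = propose_plan(goal)            # global outline, e.g. a list of steps
    done = []
    while plan:
        step = plan.pop(0)
        result = execute_step(step)      # carry out one sub-task
        done.append((step, result))
        if result == "failed":
            plan = revise_plan(goal, done, plan)  # re-plan from observations
    return done

# Stubs standing in for LLM calls.
propose = lambda goal: ["find a power source", "charge the flux capacitor"]
execute = lambda step: "ok"
revise = lambda goal, done, remaining: remaining

history = plan_and_execute("return to 1985", propose, execute, revise)
```

The key difference from a bare ReAct loop is that the plan persists across steps and can be rewritten mid-flight.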
What is Tree-of-Thoughts (ToT) and why use it?
ToT explores multiple reasoning branches instead of a single linear chain. It generates candidate thoughts, evaluates them, prunes weak branches, and continues expanding promising ones. It helps on complex, search-like problems but is token- and time-intensive.
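The generate–evaluate–prune cycle is essentially a beam search over thoughts. A minimal sketch using a toy numeric search in place of LLM-generated thoughts (in a real agent, `expand` and `score` are LLM calls; everything here is illustrative):

```python
def tree_of_thoughts(root, expand, score, beam_width=2, depth=3):
    """Breadth-first ToT: expand each frontier thought, keep the best few."""
    frontier = [root]
    for _ in range(depth):
        candidates = [t for thought in frontier for t in expand(thought)]
        if not candidates:
            break
        # Prune: keep only the highest-scoring branches.
        frontier = sorted(candidates, key=score, reverse=True)[:beam_width]
    return max(frontier, key=score)

# Toy search: "thoughts" are numbers, expanding adds 1 or 2, and the score
# prefers values close to the target 10.
best = tree_of_thoughts(
    0,
    expand=lambda t: [t + 1, t + 2],
    score=lambda t: -abs(10 - t),
    beam_width=2,
    depth=5,
)
```

The token cost the chapter warns about is visible here: every depth level multiplies the number of `expand` and `score` calls, each of which is an LLM invocation in practice.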
What is Reflexion and when is it appropriate?
Reflexion adds a critique-and-retry loop: attempt → feedback → revised attempt. A critic (or a heuristic/tool) provides brief feedback so the solver improves iteratively. It excels on ambiguous or hard tasks but typically needs a reliable feedback signal or ground-truth check.
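The solver–critic loop fits in a few lines once the two roles are separated. A minimal sketch with toy stand-ins (in practice both `solve` and `critique` are LLM or tool calls; the ground-truth check here is deliberately trivial and illustrative):

```python
def reflexion(task, solve, critique, max_attempts=3):
    """Solver-critic loop: retry with feedback until the critic approves."""
    feedback = None
    attempt = None
    for _ in range(max_attempts):
        attempt = solve(task, feedback)         # solver uses prior feedback
        ok, feedback = critique(task, attempt)  # critic checks the attempt
        if ok:
            return attempt
    return attempt  # best effort after exhausting retries

# Toy solver that "improves" once it receives feedback.
def solve(task, feedback):
    return 41 if feedback is None else 42

# Toy critic backed by a ground-truth check: the answer must equal 42.
def critique(task, attempt):
    if attempt == 42:
        return True, None
    return False, "the answer is off by one"

result = reflexion("compute the answer", solve, critique)
```

Note the dependence the chapter calls out: without a trustworthy `critique` signal, the loop can iterate without actually improving.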
How do I instruct an agent to apply these patterns?
Embed the pattern in the agent’s instructions: ask for step-by-step thinking for CoT; specify the reason → act → observe loop for ReAct; request multiple hypotheses with scoring and pruning for ToT; and add a critic/feedback loop for Reflexion. Provide or register the tools the agent can call.
What is the Sequential Thinking MCP server and what does it provide?
It’s a “thinking scratchpad” tool exposed via MCP. It doesn’t reason for the agent; instead, it structures, stores, and tracks thoughts (including revisions and branches) across steps. This supports longer plans, ReAct loops, and even ToT-style exploration with persistent context.
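To make the “scratchpad, not reasoner” distinction concrete, here is a local stand-in that only mimics the bookkeeping: ordered thoughts, in-place revisions, and named branches. The real server exposes this behavior as an MCP tool rather than a Python class; the class and method names below are illustrative assumptions, not the server’s API.

```python
class ThoughtScratchpad:
    """Local sketch of Sequential Thinking-style bookkeeping.
    It stores and tracks thoughts; it performs no reasoning itself."""

    def __init__(self):
        self.thoughts = []   # list of (number, text), in order
        self.branches = {}   # branch_id -> list of (branched_from, text)

    def add(self, text):
        """Append the next numbered thought."""
        self.thoughts.append((len(self.thoughts) + 1, text))

    def revise(self, number, text):
        """Replace an earlier thought in place, keeping its number."""
        self.thoughts[number - 1] = (number, text)

    def branch(self, branch_id, from_number, text):
        """Start or extend an alternative line of thinking."""
        self.branches.setdefault(branch_id, []).append((from_number, text))

pad = ThoughtScratchpad()
pad.add("Goal: return the DeLorean to 1985")
pad.add("Sub-task: find a 1.21-gigawatt power source")
pad.revise(2, "Sub-task: use the lightning strike as the power source")
pad.branch("plan-b", 2, "Alternative: plutonium, if any can be found")
```

Because the LLM does the reasoning and the scratchpad only records it, the same tool works with any model and alongside any of the patterns above.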
How can I improve reliability when the agent still gets answers wrong?
  • Tweak the CoT steps to match the model’s style and grammar.
  • Add small worked examples to the prompt.
  • Use ReAct with explicit tool calls and inspect the traces.
  • Store and revisit global plans via the Sequential Thinking server.
  • Add lightweight checks or critics (Reflexion) when a ground-truth or scoring tool is available.
  • Balance pattern choice against latency and token costs.
