Overview

1 Introduction to agents and their world

This chapter introduces AI agents as software entities that act on a user’s behalf, bridging traditional notions from reinforcement learning with practical assistants woven into everyday applications. It outlines a spectrum of interactions with large language models, ranging from direct use to proxy assistants, tool-using agents that execute functions with user approval, and fully autonomous agents capable of planning and decision-making. It also motivates multi-agent systems in which specialized profiles collaborate—often through a coordinating controller—to divide work, provide mutual feedback, and reduce errors, highlighting why agents are becoming central to modern AI workflows.

The chapter decomposes agents into core components that can be mixed and matched for different goals. Profiles and personas (often defined by a system prompt) anchor an agent’s role, tone, and capabilities; actions and tool use translate intent into task execution, exploration, or communication; knowledge and memory structures surface the right context efficiently; reasoning and evaluation help agents think through problems and assess outputs; and planning and feedback mechanisms organize steps toward goals, with or without human oversight. Planning can follow a single path or explore multiple strategies, and external planners or other agents can orchestrate complex workflows. Memory can be unified or hybrid, spanning documents, databases, embeddings, and lightweight lists. These components benefit both non-autonomous and autonomous agents, enabling specialization, reliability, and scalable collaboration.

As real-world iteration exposed the limits of prompt engineering, agent systems emerged that embed planning, evaluation, and repetition into the problem-solving loop, which is why structured agent workflows tend to outperform ad hoc prompting on complex tasks. Trust, guardrails, and clear goals remain essential, so many production tools prioritize supervised, non-autonomous designs while still delivering meaningful automation. At the same time, a broader software shift is underway: data and applications are increasingly exposed through natural language interfaces that agents can consume, enabling more intuitive, accurate, and integrated solutions. The result is a rapidly evolving landscape of frameworks and patterns, and this chapter equips readers with the concepts needed to navigate it and build effective agent systems.

The differences between the LLM interactions from direct action to proxy agents, agents, and autonomous agents
In this example of a multi-agent system, the controller or agent proxy communicates directly with the user. Two agents—a coder and a tester—work in the background to create code and write unit tests to test the code.
The five main components of a single-agent system (image generated through DallE-3)
An in-depth look at how we will explore creating agent profiles
The aspects of agent actions we will explore in this book
Exploring the role and use of agent memory and knowledge
a The reasoning and evaluation component and details
b Exploring the role of agent planning and reasoning
The original design of the AutoGPT agent system
A vision of how agents will interact with software systems

Summary

  • An agent is an entity that acts or exerts power, produces an effect, or serves as a means for achieving a result. In AI, an agent automates interaction with a large language model (LLM).
  • In this book, “assistant” and “agent” are used synonymously. Both terms encompass tools like OpenAI’s GPT Assistants.
  • Autonomous agents can make independent decisions, which is the crucial distinction between them and non-autonomous agents.
  • The four main types of LLM interaction are direct user interaction, agent/assistant proxy, agent/assistant, and autonomous agent.
  • Multi-agent systems involve agent profiles working together, often controlled by a proxy, to accomplish complex tasks.
  • The main components of an agent include the profile/persona, actions, knowledge/memory, reasoning/evaluation, and planning/feedback.
  • Agent profiles and personas guide an agent’s tasks, responses, and other nuances, often including background and demographics.
  • Agent actions, including tool calls, can be generated manually, recalled from memory, or derived from predefined plans.
  • Agents use knowledge and memory structures to optimize context and minimize token usage, utilizing various formats from documents to embeddings.
  • Reasoning and evaluation systems enable agents to think through problems and assess solutions using prompting patterns like zero-shot, few-shot, and chain-of-thought.
  • Planning/feedback components organize tasks to achieve goals, using single-path or multipath reasoning and integrating feedback from the environment and humans.
  • The rise of AI agents has introduced a new software development paradigm, shifting from traditional to natural language-based AI interfaces.
  • Understanding the progression and interaction of these tools helps develop agent systems, whether single, multiple, or autonomous.
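
The prompting patterns named in the summary (zero-shot, few-shot, and chain-of-thought) can be illustrated with plain string templates. This is a minimal sketch, not the book's implementation: the function names and the "Q:/A:" layout are assumptions chosen for illustration, and no model is called.

```python
from typing import List, Tuple

def zero_shot(question: str) -> str:
    """Ask the question directly, with no worked examples."""
    return f"Q: {question}\nA:"

def few_shot(question: str, examples: List[Tuple[str, str]]) -> str:
    """Prepend a handful of solved (question, answer) pairs to the new question."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\nQ: {question}\nA:"

def chain_of_thought(question: str) -> str:
    """Nudge the model to write out intermediate reasoning before answering."""
    return f"Q: {question}\nA: Let's think step by step."

# Hypothetical usage: build the three prompt variants for the same question.
question = "What is 17 + 25?"
prompts = {
    "zero_shot": zero_shot(question),
    "few_shot": few_shot(question, [("What is 2 + 3?", "5")]),
    "chain_of_thought": chain_of_thought(question),
}
```

Each variant would be sent to the model as-is; the only difference between them is how much guidance the prompt carries.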

FAQ

What does this book mean by an “AI agent” (and “assistant”)?
An agent is an active system that acts on your behalf to achieve goals. In reinforcement learning it’s a decision-making learner; in software it’s an application that performs tasks for you. In this book, assistant and agent are used interchangeably, encompassing tools like GPT-based Assistants, whether or not they are fully autonomous.
How do direct LLM use, proxy agents, tool-using agents, and autonomous agents differ?
- Direct interaction: you talk to the LLM with no intermediary.
- Proxy agent: an assistant reformulates your request for a target model or task (for example, crafting better prompts for image generation).
- Agent with tools: the LLM can call functions/plugins when you approve, then summarizes results back to you.
- Autonomous agent: plans, chooses tools, executes, and makes decisions with minimal oversight; may ask for feedback at milestones but operates independently.
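The difference between an agent that needs approval and an autonomous one can be sketched as a single gate in front of each tool call. The following is a hedged illustration only: the `TOOLS` registry, the `add` tool, and the `approve` callback are hypothetical names, and a real system would receive the tool name and arguments from the LLM.

```python
from typing import Callable, Dict

# Hypothetical tool registry; the tool name and behavior are illustrative only.
TOOLS: Dict[str, Callable[..., str]] = {
    "add": lambda a, b: str(a + b),
}

def run_tool_call(name: str, args: dict, approve: Callable[[str, dict], bool]) -> str:
    """Execute a requested tool only if the approval callback allows it."""
    if name not in TOOLS:
        return f"unknown tool: {name}"
    if not approve(name, args):
        return f"tool call '{name}' was declined by the user"
    return TOOLS[name](**args)

def always_approve(name: str, args: dict) -> bool:
    """Stand-in for an autonomous agent: no human in the loop."""
    return True

def always_decline(name: str, args: dict) -> bool:
    """Stand-in for a user who rejects the request."""
    return False

result_ok = run_tool_call("add", {"a": 2, "b": 3}, always_approve)
result_declined = run_tool_call("add", {"a": 2, "b": 3}, always_decline)
```

In an interactive setting, `approve` would typically prompt the user; swapping in `always_approve` is what moves the same code toward autonomous behavior.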
What is a multi-agent system and why use one?
A multi-agent system combines specialized agent “profiles” (personas) that collaborate. Benefits include parallel task execution, domain specialization, mutual feedback and evaluation to reduce errors, and flexibility to run autonomously or under human guidance (human-in-the-loop).
What are the main component categories of a single-agent system?
Five recurring categories:
- Profile and persona: the agent’s role, background, instructions, and communication style.
- Actions and tool use: how the agent carries out tasks and interacts with external systems.
- Knowledge and memory: what the agent knows and recalls to stay within context limits.
- Reasoning and evaluation: thinking through options and judging outputs.
- Planning and feedback: organizing steps toward goals, with or without human or environmental feedback.
What is an agent profile/persona and how is it created?
The profile (often a “system prompt”) anchors the agent’s identity and scope: role, goals, tone, constraints, and tools. It can be crafted by hand, refined with LLM assistance, or generated from data-driven methods (including evolutionary techniques), and may include background and demographic cues that shape responses.
How do agents take actions and use tools effectively?
Agents:
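A hand-crafted profile can be treated as structured data that is rendered into a system prompt. The sketch below is one possible shape, not a prescribed format: the `AgentProfile` class, its field names, and the prompt wording are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AgentProfile:
    role: str
    goals: List[str]
    tone: str = "concise and friendly"
    constraints: List[str] = field(default_factory=list)
    tools: List[str] = field(default_factory=list)

    def to_system_prompt(self) -> str:
        """Render the profile as a plain-text system prompt."""
        lines = [f"You are {self.role}.", f"Tone: {self.tone}."]
        lines.append("Goals: " + "; ".join(self.goals))
        if self.constraints:
            lines.append("Constraints: " + "; ".join(self.constraints))
        if self.tools:
            lines.append("Available tools: " + ", ".join(self.tools))
        return "\n".join(lines)

# Hypothetical profile for a code-review agent.
profile = AgentProfile(
    role="a senior Python code reviewer",
    goals=["find bugs", "suggest clearer names"],
    constraints=["never rewrite the whole file"],
    tools=["run_tests"],
)
system_prompt = profile.to_system_prompt()
```

Keeping the profile as data rather than a raw string makes it easier to refine fields individually or to generate variants programmatically.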
- Target different aims: task completion, exploration, or communication.
- Consider impact: on the environment, task outcome, and their internal state/memory.
- Generate actions: manually from instructions, by recalling prior steps, or by following/adjusting a plan. Tool calls are actions the agent chooses to execute in pursuit of the goal.
How do knowledge and memory help agents work within context limits?
Agents retrieve only the most relevant information to keep token usage low. Stores can be unified or hybrid, spanning formats like documents, databases, embeddings for semantic search, or simple lists. Effective retrieval and summarization let agents ground responses in pertinent facts without overloading context.
What roles do reasoning, evaluation, and planning play in agents?
- Reasoning and evaluation let agents think through alternatives and judge outputs before responding.
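Retrieving only the most relevant memories under a token budget can be sketched with a simple list store. Note the stand-ins: word-overlap (Jaccard) similarity replaces embedding-based semantic search, and a word count replaces a real tokenizer. The function names and sample memories are hypothetical.

```python
import re
from typing import List

def tokenize(text: str) -> List[str]:
    """Crude tokenizer: lowercase alphanumeric runs (not a real LLM tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(query: str, memory: str) -> float:
    """Jaccard word overlap, used here in place of embedding similarity."""
    q, m = set(tokenize(query)), set(tokenize(memory))
    if not q or not m:
        return 0.0
    return len(q & m) / len(q | m)

def retrieve(query: str, memories: List[str], token_budget: int) -> List[str]:
    """Pick the highest-scoring memories that fit within the token budget."""
    ranked = sorted(memories, key=lambda m: score(query, m), reverse=True)
    selected, used = [], 0
    for memory in ranked:
        if score(query, memory) == 0:
            break  # everything after this point is irrelevant to the query
        cost = len(tokenize(memory))
        if used + cost > token_budget:
            continue  # too large for the remaining budget; try a smaller one
        selected.append(memory)
        used += cost
    return selected

memories = [
    "The user prefers Python for scripting tasks.",
    "The deployment server runs Ubuntu 22.04.",
    "Lunch orders are collected on Fridays.",
]
context = retrieve("Which language does the user prefer for scripting?", memories, token_budget=10)
```

Only the selected memories would be placed in the prompt, which is how the agent stays grounded without filling its context window.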
- Planning can be autonomous or feedback-driven, adapting to changes and human input.
- Single-path planning proceeds step by step; multipath explores several strategies and preserves effective ones. External planners (code or other agents) may orchestrate larger workflows.
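The contrast between single-path and multipath planning can be made concrete with plans represented as lists of step names. This is a simplified sketch under stated assumptions: `execute` and `evaluate` are hypothetical callbacks that would normally wrap tool calls and an LLM-based or rule-based evaluator.

```python
from typing import Callable, List

Plan = List[str]

def single_path(plan: Plan, execute: Callable[[str], bool]) -> bool:
    """Run the steps in order and stop at the first step that fails."""
    return all(execute(step) for step in plan)

def multipath(candidates: List[Plan], evaluate: Callable[[Plan], float], keep: int = 1) -> List[Plan]:
    """Score several candidate plans and preserve only the most promising ones."""
    return sorted(candidates, key=evaluate, reverse=True)[:keep]

# Hypothetical candidate strategies for the same goal.
candidates = [
    ["search", "filter", "rank", "summarize"],
    ["search", "summarize"],
    ["ask user", "search", "summarize"],
]

# Illustrative evaluator: prefer shorter plans.
best = multipath(candidates, evaluate=lambda plan: -len(plan))[0]

executed: List[str] = []

def execute(step: str) -> bool:
    """Stub executor: records the step and fails only on a step named 'fail'."""
    executed.append(step)
    return step != "fail"

succeeded = single_path(best, execute)
```

Multipath selection chooses which strategy to pursue; the chosen plan is then carried out step by step, exactly as a single-path plan would be.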
Why are agents rising now, and how should teams approach adoption and trust?
Prompt engineering improved early LLM use but hit limits on complex goals. Systems like AutoGPT showed that planning, iteration, and repetition boost reliability on multifaceted tasks. Most production tools remain non-autonomous to build trust gradually. Start with scoped, tool-using agents, add guardrails and evaluation, gather feedback, and expand autonomy as confidence grows.
What is an “AI interface,” and how will it change software and data access?
An AI interface exposes data and application capabilities through natural language, not just UIs, APIs, or SQL. It enables agents to query, invoke functions, and coordinate with other systems semantically. As these interfaces spread, many applications will become more agent-ready, improving task accuracy and enabling more trustworthy, increasingly autonomous workflows (though not every use case requires this model).
