Overview

1 Introduction to agents and their world

This chapter introduces AI agents as systems that act on a user’s behalf, ranging from simple assistants to autonomous decision-makers. It outlines several interaction modes with large language models: direct use, a proxy that reformulates requests, tool-using agents that seek approval before acting, and autonomous agents that plan and execute tasks with minimal supervision. It also highlights multi-agent systems, where specialized agent profiles collaborate, provide mutual evaluation, and run tasks in parallel to improve quality, speed, and reliability under varying degrees of human guidance.

The chapter then decomposes agents into core components that shape behavior and capability. Profiles and personas (often expressed as a system prompt) establish role, tone, and objectives and can be crafted by humans, supported by LLMs, or derived from data-driven techniques. Actions and tool use span task execution, exploration, and communication; actions can be generated manually, recalled from memory, or produced from plans. Knowledge and memory structures supply relevant context efficiently through formats like documents, databases, embeddings, or lists. Reasoning and evaluation augment workflows with deliberate thinking and quality checks, while planning and feedback coordinate steps toward goals using single-path or multipath strategies and, when useful, external planners. These components benefit all agent types, not only autonomous ones.

Finally, the chapter explains why agents have surged: early prompt engineering revealed the limits of one-shot interaction and the value of iteration, inspiring systems that loop through plan–act–evaluate cycles for complex goals. Trust remains a central concern, so many production-ready tools emphasize non-autonomous patterns with strong guardrails. Looking ahead, the chapter envisions AI interfaces that expose software and data through natural language, enabling agents to orchestrate services and information more directly and accurately. With new frameworks and tools emerging rapidly, readers are encouraged to use these conceptual foundations to navigate the evolving landscape and build practical, trustworthy agent solutions.
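The plan–act–evaluate cycle mentioned above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the design of any particular framework: `run_agent_loop`, `toy_plan`, `toy_act`, and `toy_evaluate` are hypothetical names, and the toy functions stand in for what would be LLM calls in a real agent.

```python
from typing import Callable, List, Tuple


def run_agent_loop(
    goal: str,
    plan: Callable[[str, List[str]], str],
    act: Callable[[str], str],
    evaluate: Callable[[str, str], bool],
    max_steps: int = 5,
) -> Tuple[bool, List[str]]:
    """Repeat plan -> act -> evaluate until the goal is met or steps run out."""
    history: List[str] = []
    for _ in range(max_steps):
        step = plan(goal, history)   # decide the next step from goal + history
        result = act(step)           # carry out the step
        history.append(result)       # keep the outcome as context for later steps
        if evaluate(goal, result):   # stop once the result satisfies the goal
            return True, history
    return False, history            # guardrail: give up after max_steps


# Hypothetical stand-ins for LLM calls: count upward until the target is hit.
def toy_plan(goal: str, history: List[str]) -> str:
    return str(len(history) + 1)


def toy_act(step: str) -> str:
    return step


def toy_evaluate(goal: str, result: str) -> bool:
    return result == goal


done, trace = run_agent_loop("3", toy_plan, toy_act, toy_evaluate)
print(done, trace)
```

The `max_steps` cap is one simple guardrail: the loop cannot run indefinitely even if the evaluator never accepts a result.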

The differences between LLM interaction modes, from direct interaction to proxy agents, agents, and autonomous agents
In this example of a multi-agent system, the controller or agent proxy communicates directly with the user. Two agents, a coder and a tester, work in the background to create code and write unit tests for it.
The five main components of a single-agent system (image generated through DALL-E 3)
An in-depth look at how we will explore creating agent profiles
The aspects of agent actions we will explore in this book
Exploring the role and use of agent memory and knowledge
The reasoning and evaluation component and details
Exploring the role of agent planning and reasoning
The original design of the AutoGPT agent system
A vision of how agents will interact with software systems

Summary

  • An agent is an entity that acts or exerts power, produces an effect, or serves as a means for achieving a result. In AI, an agent automates interaction with a large language model (LLM).
  • The term assistant is synonymous with agent. Both terms encompass tools like OpenAI’s GPT Assistants.
  • Autonomous agents can make independent decisions, and their distinction from non-autonomous agents is crucial.
  • The four main types of LLM interaction are direct user interaction, agent/assistant proxy, agent/assistant, and autonomous agent.
  • Multi-agent systems involve agent profiles working together, often controlled by a proxy, to accomplish complex tasks.
  • The main components of an agent include the profile/persona, actions, knowledge/memory, reasoning/evaluation, and planning/feedback.
  • Agent profiles and personas guide an agent’s tasks, responses, and other nuances, often including background and demographics.
  • Actions and tools for agents can be manually generated, recalled from memory, or follow predefined plans.
  • Agents use knowledge and memory structures to optimize context and minimize token usage, utilizing various formats from documents to embeddings.
  • Reasoning and evaluation systems enable agents to think through problems and assess solutions using prompting patterns like zero-shot, few-shot, and chain-of-thought.
  • Planning/feedback components organize tasks to achieve goals, using single-path or multipath reasoning and integrating feedback from the environment and humans.
  • The rise of AI agents has introduced a new software development paradigm, shifting from traditional interfaces to natural-language-based AI interfaces.
  • Understanding how these tools progress and interact helps you develop agent systems, whether single-agent, multi-agent, or autonomous.
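The three prompting patterns named in the summary (zero-shot, few-shot, and chain-of-thought) differ mainly in what text is sent to the model. The sketch below only builds prompt strings and calls no model; the function names and the `Q:`/`A:` layout are illustrative assumptions, not a format prescribed by the chapter.

```python
from typing import List, Tuple


def zero_shot(question: str) -> str:
    """Ask the question directly, with no examples."""
    return f"Q: {question}\nA:"


def few_shot(question: str, examples: List[Tuple[str, str]]) -> str:
    """Prefix the question with worked question/answer pairs."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\nQ: {question}\nA:"


def chain_of_thought(question: str) -> str:
    """Invite the model to reason step by step before answering."""
    return f"Q: {question}\nA: Let's think step by step."


examples = [("What is 2 + 2?", "4")]
print(zero_shot("What is 3 + 5?"))
print(few_shot("What is 3 + 5?", examples))
print(chain_of_thought("What is 3 + 5?"))
```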

FAQ

How does this chapter define an AI “agent,” and how does it relate to an “assistant”?
In this book, an agent is an entity that acts to achieve results on your behalf: something that can produce effects or serve as an instrument for a guiding intelligence. The term “assistant” is used synonymously with “agent.” OpenAI often prefers “assistant” to avoid the historical implication in machine learning that an “agent” is fully autonomous and self-deciding.
What are the main ways users can interact with LLMs and agents?
The chapter outlines four patterns:
  • Direct interaction: you talk to the LLM with no intermediary.
  • Agent/assistant proxy: an LLM reformulates your request for a specific task (e.g., better prompts for image generation).
  • Agent/assistant with tools: the LLM can call plugins/functions (typically with your approval) and return results in natural language.
  • Autonomous agent: the agent plans and executes steps, makes decisions, and may seek feedback at milestones; it raises the strongest safety and ethics concerns.
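The third pattern, a tool-using agent that asks before acting, can be sketched as an approval gate in front of a tool call. Everything here is a hypothetical illustration: the `TOOLS` registry, the `word_count` tool, and the `approve` callback are assumptions, and a real agent would let the LLM choose the tool and phrase the reply.

```python
from typing import Callable, Dict

# Hypothetical tool registry: tool name -> plain Python function.
TOOLS: Dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
}


def run_tool_with_approval(
    tool_name: str,
    argument: str,
    approve: Callable[[str, str], bool],
) -> str:
    """Run a tool only if the approval callback allows it."""
    if tool_name not in TOOLS:
        return f"Unknown tool: {tool_name}"
    if not approve(tool_name, argument):  # the user stays in control
        return f"Tool call '{tool_name}' was declined by the user."
    result = TOOLS[tool_name](argument)
    return f"The tool '{tool_name}' returned: {result}"


text = "agents act on your behalf"
approved = run_tool_with_approval("word_count", text, lambda name, arg: True)
declined = run_tool_with_approval("word_count", text, lambda name, arg: False)
print(approved)
print(declined)
```

Replacing the `approve` callback with one that always returns `True` moves this sketch toward the autonomous end of the spectrum, which is where the trust concerns raised in the chapter apply most strongly.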
What are multi-agent systems, and when are they useful?
Multi-agent systems coordinate multiple specialized agent profiles (personas) to solve a problem. A common setup uses a controller/proxy that interacts with the user while worker agents (e.g., coder and tester) collaborate in the background. Benefits include specialization, parallel task execution, internal feedback and evaluation that reduce errors, and the option to operate autonomously or under human guidance.
What are the five core component systems of an agent described in this chapter?
The chapter highlights five categories:
  • Profile and persona (the agent’s system prompt and role definition)
  • Actions and tool use (how the agent affects the world and completes tasks)
  • Knowledge and memory (how context is retrieved and organized)
  • Reasoning and evaluation (thinking through problems and judging solutions)
  • Planning and feedback (organizing steps toward goals, with or without feedback)
What is an agent’s profile/persona, and how is it created?
The profile/persona is the base description that guides behavior, tone, and task approach (e.g., coder, writer, domain expert), and may include background and demographics. It can be crafted by humans, produced with LLM assistance, or generated using data-driven methods such as evolutionary algorithms. Techniques like rubrics and grounding help make profiles more effective and specific.
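A hand-crafted profile can be represented as a small data structure that renders to a system prompt. The field names and prompt wording below are assumptions chosen for illustration; the chapter does not prescribe a schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AgentProfile:
    """Hypothetical container for the role, tone, and objectives of an agent."""
    role: str
    tone: str
    objectives: List[str] = field(default_factory=list)
    background: str = ""

    def to_system_prompt(self) -> str:
        """Render the profile as plain text suitable for a system prompt."""
        lines = [f"You are a {self.role}.", f"Tone: {self.tone}."]
        if self.background:
            lines.append(f"Background: {self.background}")
        if self.objectives:
            lines.append("Objectives:")
            lines.extend(f"- {objective}" for objective in self.objectives)
        return "\n".join(lines)


profile = AgentProfile(
    role="Python code reviewer",
    tone="concise and constructive",
    objectives=["Flag bugs before style issues", "Suggest a fix for every problem"],
)
prompt = profile.to_system_prompt()
print(prompt)
```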
How do agent actions work—targets, impact, and generation methods?
Agents act for task completion, exploration, or communication. They consider action impact on task outcomes, the environment, and their internal state. Actions can be generated manually, recalled from memory, or produced via predefined plans; tool use is part of this action space and helps drive task progress and learning.
How do agents use knowledge and memory effectively?
Agents retrieve only the most relevant information to stay within token limits while maintaining context quality. Stores can be unified or hybrid, and formats include documents (e.g., PDFs), databases (relational, object, document), vector embeddings for semantic similarity search, or even simple lists functioning as memories.
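Retrieving only the most relevant memories within a token limit can be sketched as similarity ranking followed by a budget check. This toy makes two loud assumptions: a bag-of-words count vector stands in for a real embedding model, and one word is counted as one token. The names `embed`, `cosine`, and `retrieve` are illustrative.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy 'embedding': bag-of-words counts (an assumption, not a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, memories: List[str], token_budget: int) -> List[str]:
    """Return the most similar memories that fit the budget (1 word ~ 1 token)."""
    query_vec = embed(query)
    scored = sorted(
        ((cosine(query_vec, embed(m)), m) for m in memories),
        key=lambda pair: pair[0],
        reverse=True,
    )
    selected: List[str] = []
    used = 0
    for score, memory in scored:
        if score == 0.0:
            break  # nothing left shares any word with the query
        cost = len(memory.split())
        if used + cost <= token_budget:
            selected.append(memory)
            used += cost
    return selected


memories = [
    "preferred programming language is python",
    "project deadline is friday",
    "python is the preferred language for scripts",
]
print(retrieve("preferred programming language", memories, token_budget=6))
```

With a budget of 6 only the closest memory fits; raising the budget to 12 admits the second-closest one as well, while the unrelated deadline note is never selected.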
What roles do reasoning, evaluation, and planning play in agent performance?
Reasoning and evaluation annotate workflows so agents can think through problems and assess solutions. Planning can be done without feedback (autonomous) or with feedback from the environment or humans. Agents may use single-path (step-by-step) or multipath strategies (exploring alternatives and retaining efficient ones); external planners or other agents can orchestrate plans. Even non-autonomous agents benefit from planning.
Why are AI agents rising now, and what did early systems like AutoGPT demonstrate?
Prompt engineering improved results but often required iteration and reflection. Early autonomous systems like AutoGPT showed that planning, iteration, and repetition help LLMs tackle complex, multi-step goals. Trust remains a central challenge: users must trust the agent’s decision process, guardrails/evaluation, and goal definition. For that reason, many production tools emphasize non-autonomous designs.
What is an AI interface, and how does it change software and data interaction?
An AI interface is a natural-language layer of functions, tools, and data that exposes software and information for agents and users. It shifts interaction away from traditional UI/API/SQL towards semantic, language-based access, enabling agents to collect data and operate applications (including other agents) more accurately and autonomously. While not suited to every system, it will fit many high-value use cases.
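One way to picture such a layer is a function that describes itself in terms an LLM can read. The sketch below derives a description from a Python function's name, docstring, and type hints using the standard `inspect` module. The output dictionary shape, `describe_function`, and `get_order_status` are all hypothetical; this is not the schema of any specific vendor's tool-calling API.

```python
import inspect
from typing import Any, Callable, Dict


def describe_function(func: Callable[..., Any]) -> Dict[str, Any]:
    """Build a plain description of a function from its signature and docstring."""
    parameters: Dict[str, str] = {}
    for name, param in inspect.signature(func).parameters.items():
        if param.annotation is inspect.Parameter.empty:
            parameters[name] = "any"  # no type hint available
        else:
            parameters[name] = getattr(
                param.annotation, "__name__", str(param.annotation)
            )
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": parameters,
    }


def get_order_status(order_id: str) -> str:
    """Look up the shipping status of an order."""
    # Hypothetical business function; a real one would query a data store.
    return f"Order {order_id} has shipped."


spec = describe_function(get_order_status)
print(spec)
```

An agent given `spec` has enough natural-language context to decide when the function applies and which argument to supply, which is the shift away from hand-written UI, API, or SQL access that the answer describes.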
