Overview

1 Introduction to AI Agents and Applications

Large language models have moved from novelty to necessity, enabling applications that understand, generate, and act on natural language—and unlocking a new class of systems: AI agents. This chapter introduces the core problems in building LLM-powered software and motivates the use of frameworks like LangChain, LangGraph, and LangSmith to replace boilerplate with proven patterns. It sets the stage by outlining the main application families—engines, chatbots, and agents—and the foundational techniques you will use throughout the book, especially prompt engineering and retrieval-augmented generation (RAG).

At a high level, the architecture centers on ingesting data into Documents, splitting text into chunks, embedding those chunks, and storing them in vector databases for fast semantic retrieval. Prompts combine user intent with retrieved context and are sent to an LLM or chat model, with output parsers shaping structured results and optional caching reducing cost. LangChain’s design emphasizes modularity, composability, and extensibility: standardized interfaces, the Runnable API, and the LangChain Expression Language (LCEL) let you build reliable chains and, when needed, shift to graph-shaped flows with LangGraph. Beyond vector search, retrievers can tap relational or graph stores, and the ecosystem supports evaluation, debugging, and monitoring via LangSmith.
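
At its simplest, this flow can be wired in a few lines of LCEL. The sketch below is illustrative rather than prescriptive: it assumes the langchain-core and langchain-openai packages and an OPENAI_API_KEY in the environment, and the sample texts, model name, and prompt wording are made up for the example.

    # Minimal RAG chain in LCEL: retrieve -> prompt -> model -> parse.
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.runnables import RunnablePassthrough
    from langchain_core.vectorstores import InMemoryVectorStore
    from langchain_openai import ChatOpenAI, OpenAIEmbeddings

    # Stand-in for real ingested chunks (see the ingestion sketch later).
    store = InMemoryVectorStore.from_texts(
        ["LangChain standardizes common LLM application patterns.",
         "LangGraph adds graph-shaped workflows for agents."],
        embedding=OpenAIEmbeddings(),
    )
    retriever = store.as_retriever()

    def format_docs(docs):
        # Join retrieved Document objects into one context string.
        return "\n\n".join(doc.page_content for doc in docs)

    prompt = ChatPromptTemplate.from_template(
        "Answer using only this context:\n{context}\n\nQuestion: {question}"
    )

    # retrieve -> prompt -> model -> parse, composed with the | operator
    chain = (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | ChatOpenAI(model="gpt-4o-mini")
        | StrOutputParser()
    )
    print(chain.invoke("What does LangGraph add?"))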

The chapter surveys practical patterns across three app types. Engines provide targeted capabilities such as summarization and Q&A, typically implemented with RAG across ingestion and query phases. Chatbots add safe, guided conversation with role-based prompting, memory, and domain grounding. Agents represent the most advanced class: they plan and execute multi-step workflows, select and invoke tools and APIs, integrate heterogeneous data, and can include human-in-the-loop controls; emerging standards like the Model Context Protocol (MCP) further streamline tool integration. To adapt models to real needs, the chapter compares prompt engineering, RAG, and fine-tuning, and offers guidance on model selection (task purpose, context-window size, speed, model size, multilingual support, instruction vs. reasoning models, and open source vs. proprietary licensing). Finally, it previews a hands-on path to design, build, evaluate, and scale production-grade systems with LangChain and LangGraph, cultivating transferable skills and best practices.

Figures referenced in this chapter:

  • LangChain architecture: the Document Loader imports data, which the Text Splitter divides into chunks. The chunks are vectorized by an Embedding Model, stored in a Vector Store, and retrieved through a Retriever for the LLM. The LLM Cache checks for prior requests so cached responses can be returned, and the Output Parser formats the LLM's final response.
  • Object model of the classes associated with the Document core entity: document loaders (which create Document objects), splitters (which split documents into lists of Document objects), vector stores (which store Document objects), and retrievers (which retrieve Document objects from vector stores and other sources).
  • Object model of the classes associated with language models, including Prompt Templates and Prompt Values.
  • A summarization engine, which summarizes and stores content from large volumes of text and can be invoked by other systems through a REST API.
  • A Q&A engine implemented with the RAG design: the engine stores domain-specific document information in a vector store. When an external system sends a query, the engine converts the natural language question into its embedding (vector) representation, retrieves the related documents from the vector store, and gives the LLM the information it needs to craft a natural language response.
  • A summarization chatbot, which resembles a summarization engine but offers an interactive experience in which the LLM and the user work together to fine-tune and improve the results.
  • A sequence diagram of how a user interacts with an LLM through a chatbot to produce a more concise summary.
  • The workflow of an AI agent tasked with assembling holiday packages: an external client system sends a customized holiday request in natural language; the agent prompts the LLM to select tools and formulate queries in technology-specific formats; the agent then executes the queries, gathers the results, and sends them back to the LLM, along with the original request, to obtain a comprehensive holiday package summary; finally, the agent forwards the summarized package to the client system.
  • The ingestion flow: a collection of documents is split into text chunks and transformed into vector embeddings; both the text chunks and their embeddings are then stored in a vector store (a code sketch of this flow follows).
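
A hedged sketch of that ingestion flow, assuming the langchain-community, langchain-text-splitters, langchain-core, and langchain-openai packages; the file name and chunking parameters are made up for illustration.

    # Ingestion phase: load -> split -> embed -> store.
    from langchain_community.document_loaders import TextLoader
    from langchain_core.vectorstores import InMemoryVectorStore
    from langchain_openai import OpenAIEmbeddings
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    docs = TextLoader("holiday_packages.txt").load()   # -> list of Document objects
    chunks = RecursiveCharacterTextSplitter(
        chunk_size=500, chunk_overlap=50
    ).split_documents(docs)                            # smaller Documents
    store = InMemoryVectorStore.from_documents(chunks, embedding=OpenAIEmbeddings())
    retriever = store.as_retriever()                   # ready for the query phase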

Summary

  • LLMs have rapidly evolved into core building blocks for modern applications, enabling tasks like summarization, semantic search, and conversational assistants.
  • Without frameworks, teams often reinvent the wheel—managing ingestion, embeddings, retrieval, and orchestration with brittle, one-off code. LangChain addresses this by standardizing these patterns into modular, reusable components.
  • LangChain’s modular architecture builds on loaders, splitters, embedding models, retrievers, and vector stores, making it straightforward to build engines such as summarization and Q&A systems.
  • Conversational use cases demand more than static pipelines. LLM-based chatbots extend engines with dialogue management and memory, allowing adaptive, multi-turn interactions.
  • Beyond chatbots, AI agents represent the most advanced type of LLM application. Agents orchestrate multi-step workflows and tools under LLM guidance, with frameworks like LangGraph designed to make this practical and maintainable.
  • Retrieval-Augmented Generation (RAG) is a foundational pattern that grounds LLM outputs in external knowledge, improving accuracy while reducing hallucinations and token costs.
  • Prompt engineering remains a critical skill for shaping LLM behavior, but when prompts alone aren’t enough, RAG or even fine-tuning can extend capabilities further.

FAQ

What are the core challenges in building LLM-powered applications?
Common hurdles include: ingesting and managing proprietary data; structuring and maintaining prompts; chaining model calls reliably; integrating external APIs and services; handling context-window limits and token costs; orchestrating multi-step workflows; and evaluating, debugging, and monitoring apps once deployed.

How does LangChain address these challenges?
LangChain standardizes recurring patterns into modular, composable components (document loaders, text splitters, embedding models, vector stores, retrievers, prompt templates, LLM cache, output parsers). With the Runnable interface and LangChain Expression Language (LCEL), you can chain components consistently, reducing glue code and improving maintainability. Its design principles—modularity, composability, extensibility—let you swap models, stores, and connectors without rewrites.
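
As one concrete example of these components, the LLM cache can be switched on process-wide in a couple of lines. This is a sketch; exact import paths vary across LangChain versions.

    # Illustrative: repeated identical calls are answered from the cache
    # instead of re-billing tokens. Import paths vary by version.
    from langchain_core.caches import InMemoryCache
    from langchain_core.globals import set_llm_cache
    from langchain_openai import ChatOpenAI

    set_llm_cache(InMemoryCache())               # process-wide cache
    llm = ChatOpenAI(model="gpt-4o-mini")
    llm.invoke("Define RAG in one sentence.")    # hits the API
    llm.invoke("Define RAG in one sentence.")    # served from the cache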

What is the typical LangChain data flow and architecture?
Data is loaded and wrapped as Document objects, split into chunks, embedded into vectors, and stored in a vector store. At query time, a retriever fetches relevant chunks, a prompt template combines the user query with retrieved context, the LLM/ChatModel generates an answer (optionally using an LLM cache), and an output parser structures the result (e.g., JSON). Graph databases can complement vector stores when relationships matter.

What’s the difference between a chain and an agent?
A chain is a fixed, sequential workflow tailored to a task (e.g., load → retrieve → prompt → LLM → parse). An agent manages a dynamic workflow: it selects tools at runtime, branches based on intermediate results, and iterates until the goal is met. Tools (plugins) form a toolkit the agent can choose from. LangGraph helps implement these adaptive, graph-shaped flows.
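
To make the contrast concrete, here is a hedged sketch of a tool-selecting agent built with LangGraph's prebuilt ReAct-style helper; the weather tool is a made-up stand-in for a real API.

    # The agent decides at runtime whether and how to call the tool.
    from langchain_core.tools import tool
    from langchain_openai import ChatOpenAI
    from langgraph.prebuilt import create_react_agent

    @tool
    def get_weather(city: str) -> str:
        """Return a (fake) weather report for a city."""
        return f"It is sunny in {city}."

    agent = create_react_agent(ChatOpenAI(model="gpt-4o-mini"), [get_weather])
    result = agent.invoke({"messages": [("user", "Weather in Rome?")]})
    print(result["messages"][-1].content)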

What is Retrieval-Augmented Generation (RAG) and why use it?
RAG augments generation with retrieved context from a local knowledge base (often a vector store). Ingestion: load, split, embed, and index documents. Query: embed the question, retrieve similar chunks, and include them in the prompt. Benefits: efficiency (shorter prompts, lower cost), accuracy (grounded answers, fewer hallucinations), and flexibility (swap embeddings, retrievers, or stores as needed).

How do engines, chatbots, and AI agents differ?
  • Engines: backend capabilities (e.g., summarization, Q&A) exposed via APIs; great for system-to-system tasks and RAG-powered search.
  • Chatbots: conversational interfaces with role-based prompts and memory; can ground answers via retrieval.
  • Agents: autonomous or semi-autonomous systems that plan and execute multi-step tasks across tools and data sources, returning synthesized results.

What are LCEL and the Runnable interface, and why do they matter?
Runnable is a common interface that lets components compose cleanly. LCEL is an expression language to wire components together declaratively. Together, they reduce boilerplate, make pipelines consistent, and simplify testing, debugging, and maintenance.
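
A tiny illustration of the shared interface, with no LLM involved: functions wrapped as Runnables compose with the pipe operator into a pipeline that exposes the standard invoke and batch methods.

    # Two plain string functions composed into one Runnable pipeline.
    from langchain_core.runnables import RunnableLambda

    pipeline = RunnableLambda(str.strip) | RunnableLambda(str.title)
    print(pipeline.invoke("  hello lcel  "))        # -> "Hello Lcel"
    print(pipeline.batch(["foo bar", "baz qux"]))   # -> ["Foo Bar", "Baz Qux"]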

How do knowledge graphs and LangGraph fit into the picture?
Knowledge graphs (e.g., Neo4j) represent entities and relationships, complementing vector stores when relational reasoning is needed. LangChain integrates graph databases and supports graph-based memory/planning. LangGraph formalizes graph-shaped workflows and provides agent/orchestrator classes to build robust multi-step, branching applications.
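
A minimal LangGraph sketch, assuming the langgraph package; the node logic is a placeholder that only shows the typed-state-plus-graph shape.

    # A one-node graph: state flows from START through a node to END.
    from typing import TypedDict
    from langgraph.graph import StateGraph, START, END

    class State(TypedDict):
        question: str
        answer: str

    def answer_node(state: State) -> dict:
        # Placeholder logic; a real node would call an LLM or a tool.
        return {"answer": f"Echo: {state['question']}"}

    graph = StateGraph(State)
    graph.add_node("answer", answer_node)
    graph.add_edge(START, "answer")
    graph.add_edge("answer", END)
    app = graph.compile()
    print(app.invoke({"question": "What is LangGraph?"}))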

How should I choose an LLM for my application?
Consider: task purpose (general vs specialized like code); context-window size; multilingual support; model size vs cost/latency; speed requirements; instruction vs reasoning models; and open-source vs proprietary trade-offs. Many systems mix models (e.g., smaller models for routing/summarization, larger or reasoning models for synthesis or planning).

How can I evaluate, monitor, and keep LLM apps reliable and safe?
Use LangSmith for evaluation, debugging, and monitoring. Employ prompt guardrails, validators, and source citation to improve grounding. Include human-in-the-loop for high-stakes actions. For tool use at scale, adopt the Model Context Protocol (MCP) to standardize tool integration, and orchestrate multi-step flows with LangGraph for control and observability.
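
As a hedged example, LangSmith tracing is commonly switched on through environment variables before running any chains; the exact variable names and setup depend on your LangSmith account and library versions.

    # Illustrative: once tracing is enabled, subsequent chain and agent
    # runs appear in LangSmith for inspection and debugging.
    import os
    os.environ["LANGCHAIN_TRACING_V2"] = "true"
    os.environ["LANGCHAIN_API_KEY"] = "..."   # your LangSmith API key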
