Overview

1 Threat-modeling agentic pipelines

This chapter explains how modern AI reshapes offensive security by adding reasoning and structure to work that was once linear, manual, and brittle. Where traditional pipelines moved from data collection to a one-time decision, agentic systems bring contextual judgment to each stage, turning vast, noisy outputs into prioritized, reproducible next steps. The focus is on artifacts—the concrete, structured findings produced by tools—and on disciplined routing of those artifacts through controlled, auditable pipelines. AI serves as decision support: humans define objectives and approve escalation; tools still execute; agents reduce cognitive load by interpreting results, correlating weak signals, and proposing strategy shifts as environments change.

The chapter demystifies the technical building blocks behind that shift. Large language models are positioned as probabilistic reasoning engines that can summarize, explain, and triage—but also hallucinate and remain non-deterministic, requiring human oversight. Agents extend LLMs with orchestration, tools, memory, and a knowledge base so they can plan, act, and adapt toward goals, not just answer prompts. Pipelines then provide the governing architecture: they formalize intake, interpretation, controlled action, evaluation, and reporting, with safety gates and logging at each hop. Treating artifacts as first-class objects separates evidence from judgment, yields clarity (tools execute, agents interpret, pipelines route), and replaces ad-hoc stitching with reproducible, accountable operations.

Ethical and operational guardrails are integral: test only within authorized scope, avoid production harm, follow the law, keep humans in the loop for risky actions, protect sensitive data, document everything, and practice responsible disclosure. With these boundaries, agentic pipelines scale precision and consistency across roles: solo hunters gain throughput and better write-ups; red teams capture engagements as reusable playbooks; blue and purple teams convert offensive traces into richer detections and training; leaders get measurable, auditable evidence of scope, controls, and outcomes. In short, the chapter shows how to move from clever prompts to dependable systems—turning creative exploration into controlled, data-driven security practice that preserves human judgment while amplifying impact.

The conventional triage pipeline mental model. This diagram provides a high-level (macro) view of the conventional, human-driven security triage pipeline, as sketched in the notebook. It serves as a roadmap for this linear workflow, starting with data collection and proceeding sequentially through vulnerability assessment, risk scoring, and attack path planning. This entire, predictable sequence traditionally concludes with a human operator making a final decision or handoff.
Seven best practices for offensive security. 1) Authorized scope only: to ensure we do not cause damage or expose sensitive information, we must work within pre-defined, scoped boundaries. 2) No production harm: do not take actions that will negatively affect production traffic. 3) Follow the law: we must stay compliant and follow regulations. 4) Human oversight: humans must be in the loop to review and validate findings. 5) Protect sensitive data: proper processes must be in place to ensure personal and identifiable information is not exposed. 6) Document everything: we must store logs, traces, and other artifacts that allow us to audit our systems. 7) Responsible disclosure: we should give the affected party a reasonable amount of time to fix issues before publicly revealing them. These best practices ensure that offensive security teams consistently deliver value and maintain professionalism within their organization.
An illustrative example of how an LLM generates the next token. As the LLM generates a sentence, it conditions on all of the tokens produced so far, not just the most recent one, and assigns a probability to each candidate next token. In the example above, since green has the highest logit value (and therefore the highest probability after softmax), it is the next word generated in the sentence.
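Greedy next-token selection can be sketched in a few lines. The logit values below are invented for illustration; softmax converts them into the probability distribution the model samples from:

```python
import math

# Hypothetical logits the model assigns to candidate next tokens
# after a prefix like "The light turned" (values are illustrative).
logits = {"green": 4.2, "red": 3.1, "blue": 0.7, "loud": -1.5}

# Softmax converts raw logits into a probability distribution.
exp_vals = {tok: math.exp(v) for tok, v in logits.items()}
total = sum(exp_vals.values())
probs = {tok: v / total for tok, v in exp_vals.items()}

# Greedy decoding picks the highest-probability token -- "green" here.
# Real systems often sample instead, which is one source of the
# non-determinism discussed in this chapter.
next_token = max(probs, key=probs.get)
print(next_token)  # green
```

Because sampling (rather than greedy argmax) draws randomly from this distribution, the same prompt can yield different outputs on different runs.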
An overview of AI agent systems. AI agents consist of 4 components that are orchestrated together to produce an outcome: 1) the model, which is a foundation model, 2) tools, which are functions that the LLM can use to interact with the world (e.g., custom functions, APIs, MCP servers), 3) memory, where previous interactions are stored either in the context window or in a vector database, and 4) the knowledge base, where additional context (documents, old conversations, etc.) is stored in a vector database.
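The four components can be sketched as a minimal loop. This is a toy illustration, not a real framework API: a stub function stands in for the foundation model, and all names are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    model: Callable[[str], str]      # foundation model (stubbed below)
    tools: dict                      # callable tools the agent can invoke
    memory: list = field(default_factory=list)   # prior interactions
    knowledge: dict = field(default_factory=dict)  # grounding documents

    def step(self, observation: str) -> str:
        # Record the observation, build a small recent-context window,
        # ask the model which tool to use, then invoke it.
        self.memory.append(observation)
        context = " | ".join(self.memory[-3:])
        decision = self.model(context)
        if decision in self.tools:
            return self.tools[decision](observation)
        return decision

# Stub model: routes anything mentioning "port" to a summarize tool.
agent = Agent(
    model=lambda ctx: "summarize" if "port" in ctx else "noop",
    tools={"summarize": lambda obs: f"summary: {obs}"},
)
print(agent.step("port 443 open"))  # summary: port 443 open
```

In a real system the stub would be an LLM call, the tools would wrap scanners or APIs, and the memory and knowledge base would live in a context window or vector store, as described above.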
The Dynamic AI Agent System Mental Model. This diagram models the more dynamic system that results from introducing an AI Agent, representing the technology and its surrounding world. Unlike the linear sequence in Model 1, the central Agent creates a cyclical, event-driven workflow that allows it to initiate reconnaissance, penetration testing, or triage in response to new data. This model provides a framework for understanding the complex, parallel interactions and feedback loops unique to the AI-driven system. A reader can use this model to predict the AI's behavior or debug its emergent actions.
An example: reconnaissance agent pipeline. An AI agent system consists of 1) a data pipeline that feeds logs and other system inputs to an LLM, 2) a reasoning component that allows AI models to determine appropriate actions and steps, 3) an evaluation component that assesses the impact of the changes, and 4) a reporting system for the security professionals.
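One way to picture this stage-by-stage routing, with a log entry at every hop for auditability, is the toy sketch below. The stage names and rules are illustrative assumptions, not a real pipeline framework:

```python
def run_pipeline(raw, stages, log):
    # Pass an artifact through each named stage, recording every hop
    # so the run can be audited afterwards.
    artifact = raw
    for name, stage in stages:
        artifact = stage(artifact)
        log.append((name, artifact))
    return artifact

log = []
stages = [
    ("intake",   lambda r: {"port": r}),
    ("reason",   lambda a: {**a, "priority": "high" if a["port"] in (80, 443) else "low"}),
    ("evaluate", lambda a: {**a, "in_scope": True}),
    ("report",   lambda a: f"port {a['port']}: {a['priority']} priority"),
]
print(run_pipeline(443, stages, log))  # port 443: high priority
print(len(log))                        # 4 -- one audit entry per stage
```

The point of the sketch is the shape, not the logic: artifacts are first-class objects, every stage transformation is logged, and the final report is traceable back through each hop.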

Summary

  • Large language models (LLMs) introduce contextual reasoning to security testing, turning raw data into actionable intelligence when guided by skilled professionals.
  • Because LLMs are probabilistic systems, their outputs can be unreliable without validation; human oversight is essential to ensure accuracy and safety.
  • AI agents build on LLMs by adding memory, planning, and tool-use capabilities, enabling reasoning systems that can act rather than merely respond.
  • Pipelines provide the structure agents need to remain reliable and accountable—defining clear stages for input, reasoning, action, evaluation, and reporting.
  • AI agent pipelines allow offensive security teams to scale intelligence without losing control—empowering individuals, red teams, and CISOs alike to achieve measurable, repeatable outcomes.

FAQ

What problem do AI agent pipelines solve in offensive security? They close the gap between fast execution and informed decision-making. By formalizing how information moves from reconnaissance to action, pipelines turn AI-assisted insights into controlled, repeatable, and auditable workflows—reducing chaos while preserving human oversight.
How are AI agents different from traditional security automation? Traditional tooling runs predefined commands and rules; AI agents add reasoning, planning, and adaptation. Agents use a model plus tools, memory, and goals to iteratively choose next steps, evaluate results, and adjust strategies—without replacing existing scanners or exploits.
What are “artifacts” and why are they central to the pipeline? Artifacts are durable records of observations (for example, open ports, response headers, discovered endpoints, or authentication behaviors). Pipelines route these artifacts from one stage to the next, while agents decide which artifacts matter—separating raw evidence from interpretation and keeping decisions traceable.
Where do large language models (LLMs) fit, and what are their limitations? LLMs provide contextual reasoning—summarizing outputs, prioritizing findings, and explaining what results mean. They are probabilistic and can hallucinate or vary across runs, so teams should treat them as powerful but fallible components with human review, safety checks, and clean data inputs.
How do pipelines keep offensive testing ethical and contained? They embed decision gates and approvals, enforce scoped boundaries, and generate comprehensive logs. Combined with best practices—authorized scope, no production harm, legal compliance, human oversight, data protection, documentation, and responsible disclosure—pipelines help ensure safe, professional operations.
How does an agentic workflow differ from the classic pentest pipeline? The classic flow is linear and deterministic (collect → assess → score → plan → decide). Agentic workflows are cyclical and event-driven: an agent initiates tasks, triages results, adapts strategy in real time, and loops—bringing feedback and reasoning into each step.
What are the core components of an AI agent system? An agent combines: 1) a model for reasoning, 2) tools for taking actions, 3) memory to retain context, and 4) a knowledge base to ground decisions—coordinated by orchestration logic that plans and sequences the next step toward a goal.
Why use pipelines instead of ad‑hoc scripts? Pipelines bring speed, standardization, and scale; they support continuous monitoring, reduce costs through automation, and double as living documentation. They also add clarity (tools execute, agents interpret, pipelines route), scalability (fewer manual handoffs), and accountability (every decision tied to an artifact).
Who benefits from AI agent pipelines, and how? Solo researchers gain force-multiplying triage and reporting; red teams capture repeatable playbooks; purple/detection teams reuse offensive traces to tune defenses; blue/SOC teams safely replay scenarios; leaders get measurable, auditable evidence of scope, controls, and outcomes.
What does a reconnaissance pipeline look like in practice? Data flows in from trusted tools and logs, an agent interprets context, controlled actions run against selected targets, results are evaluated against rules or safety checks, and findings are reported. For example, web-relevant services from a port scan advance to HTTP probing explicitly, with human approvals gating any risky steps.
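That gating step might be sketched as follows. The service names, hosts, and approval flag are illustrative assumptions, not output from any real scanner:

```python
# Advance only web-relevant services from a port scan to HTTP probing,
# with a human-approval gate before any probe runs.
WEB_SERVICES = {"http", "https", "http-alt"}

def select_web_targets(scan_results):
    # Artifact filtering: only web-relevant services move to the next stage.
    return [r for r in scan_results if r["service"] in WEB_SERVICES]

def probe(targets, human_approved):
    # Safety gate: risky actions require explicit human approval.
    if not human_approved:
        raise PermissionError("risky step requires human approval")
    return [f"GET http://{t['host']}:{t['port']}/" for t in targets]

scan = [
    {"host": "10.0.0.5", "port": 22, "service": "ssh"},
    {"host": "10.0.0.5", "port": 80, "service": "http"},
    {"host": "10.0.0.5", "port": 8080, "service": "http-alt"},
]
targets = select_web_targets(scan)
print(probe(targets, human_approved=True))
```

Here SSH is filtered out before probing, and withholding approval stops the pipeline at the gate rather than letting the action run.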
