AI Agents for Offensive Security

Understanding AI-powered attacks and how to stop them

Mark Foudy

MEAP began April 2026
Last updated June 2026
Publication in Early 2027 (estimated)

ISBN 9781633434172
300 pages (estimated)

Included with a Manning Online subscription

printed in black & white



table of contents

PART 1: FOUNDATIONS OF AI-DRIVEN OFFENSIVE SECURITY

1 Threat-modeling agentic pipelines

1.1 Offensive Security

1.1.1 Traditional offensive security workflows

1.1.2 Best practices for offensive security

1.1.3 Why Tools Alone Are No Longer Enough

1.2 Large language models (LLMs) as security tools

1.2.1 Intelligence as a Component

1.3 Introducing AI Agents: Reasoning That Acts

1.3.1 What Is An AI Agent?

1.3.2 Why agents matter in offensive security

1.3.3 Agentic Offensive-Security Workflow

1.4 Pipelines: Information Routing Systems

1.4.1 Pipelines as the Organizing Principle

1.4.2 From Autonomy to Architecture

1.4.3 Why pipelines matter

1.5 Artifacts provide precise decision-making in offensive security

1.5.1 Artifacts as the Unit of Movement

1.5.2 Where Intelligence Enters the System

1.6 Why This Matters to You

1.6.1 Penetration Testers and Red Teams

1.6.2 Purple Teams and Detection Engineers

1.6.3 Blue Teams and SOC Analysts

1.7 Summary

2 Building your first AI agent

2.1 Limitations of script-based automation

2.2 What is an agent?

2.3 Anatomy of an agent

2.3.1 Core components

2.3.2 The ReAct agent loop

2.3.3 The scope and responsibility of an agent

2.4 The minimal agent specification

2.5 Building the minimal agent

2.5.1 What this agent does

2.5.2 Messages and observations

2.5.3 Tools

2.5.4 Tool A: extract URLs

2.5.5 Tool B: summarize URLs

2.5.6 Artifacts and logging

2.5.7 The minimal agent loop

2.5.8 Putting it together

2.6 Safety and governance

2.6.1 Why safety matters

2.6.2 Building safety gates

2.6.3 Sandboxing and isolation

2.6.4 Comprehensive logging

2.6.5 Operational policies

2.6.6 Implementing kill switches

2.6.7 Safety and governance summary

2.7 The triage agent

2.7.1 What is triage?

2.7.2 nmap output

2.7.3 Agent decision scope and constraints

2.7.4 Triage artifacts

2.7.5 Practical benefits and safety constraints

2.8 Summary

3 Multi-agent pipelines and orchestration

3.1 Why multi-agent systems?

3.2 Multi-agent mental model

3.3 Artifacts as agent interfaces

3.4 Artifact provenance, replay, and auditability

3.5 Agent orchestration and execution control

3.5.1 What the orchestrator does

3.5.2 Why do we need an orchestrator?

3.6 Shared state and memory boundaries

3.7 Safety gates and authorization control

3.7.1 Applying safety gates in multi-agent systems

3.7.2 Human in the loop

3.8 Error handling and resilience

3.8.1 Retry logic and exponential backoff

3.8.2 Checkpointing

3.8.3 Error classes and mitigation strategies

3.8.4 Monitoring, metrics, and alerts

3.8.5 Building confidence in AI agents through resilience

3.9 Metric visualization and auditing AI agents

3.9.1 Generating simple traces

3.9.2 Graph-based views

3.9.3 Correlating artifacts and logs

3.9.4 Defense through transparency

3.9.5 Practical visualization tips

3.10 Reconnaissance multi-agent pipeline

3.10.1 Pipeline overview

3.10.2 ReconNormalizeAgent

3.10.3 TriageAgent

3.10.4 ReportAgent

3.11 Failure modes in multi-agent systems

3.11.1 Silent artifact drift

3.11.2 Orchestration hidden inside agents

3.11.3 Safety gates treated as formalities

3.11.4 Memory accumulation without correction

3.11.5 Overloaded agents

3.12 Summary

PART 2: RECONNAISSANCE PIPELINES

4 Passive reconnaissance agents

4.1 Why an AI reconnaissance agent?

4.2 The minimal agent architecture

4.2.1 Pipelines vs. scripts

4.2.2 The four building blocks to building ReAct agents

4.2.3 Artifacts provide persistent memory to an agent

4.2.4 Safety gates

4.2.5 Two scripts, one story

4.3 Building the minimal AI recon pipeline

4.3.1 Getting started

4.3.2 Starting the artifact helper

4.3.3 Creating the passive reconnaissance pipeline

4.3.4 Diving into the passive reconnaissance results

4.4 Inspecting your reconnaissance results

4.4.1 Viewing results

4.4.2 Checking for patterns

4.4.3 Clean up and prepare for next steps

4.5 Reading the pipeline like a story

4.5.1 Artifacts as episodes of reasoning

4.5.2 Example: following the trail

4.6 Safety, scope, and ethics

4.6.1 Scope defines the battlefield

4.6.2 Gates keep humans in the loop

4.6.3 Handling and storing data safely

4.6.4 AI models and responsibility

4.7 Summary

PART 3: VULNERABILITY DISCOVERY & EXPLOITATION PIPELINES

5 AI-assisted vulnerability discovery

6 Agent-assisted exploitation strategies

PART 4: OPERATIONAL SYNTHESIS & PURPLE TEAMING

7 Intelligent report generation

8 Orchestrating multi-agent workflows

9 The purple-team loop & detection engineering

Appendices

Appendix A: Building your AI lab

Appendix B: Self-hosting & deployment

Appendix C: Hacking the prompt — Advanced prompt engineering for attackers

Overview

1 Threat-modeling agentic pipelines

This chapter introduces AI agent pipelines as a structured way to bring large language models and autonomous reasoning into offensive security without losing control, accountability, or human judgment. It explains that modern testers already rely on chains of tools and artifacts, but the growing volume and complexity of scan results, logs, endpoints, and vulnerabilities make manual coordination increasingly fragile. AI can help by interpreting context, prioritizing findings, reducing cognitive load, and adapting strategy, but it must be embedded in disciplined workflows rather than used as ungoverned automation.

The chapter distinguishes between traditional automation, LLMs, agents, pipelines, and artifacts. Traditional tools execute repeatable actions, while LLMs provide probabilistic reasoning and interpretation. Agents extend LLMs by adding goals, tools, memory, knowledge, and orchestration so they can plan and act within limits. Pipelines then provide the architecture that makes agent behavior repeatable and auditable: they collect trusted inputs, interpret context, run controlled actions, evaluate results, and report findings. Artifacts such as open ports, resolved subdomains, HTTP metadata, endpoints, and error messages become the evidence that moves through this system and drives precise decisions.

A central theme is that AI should support offensive security professionals, not replace them. Humans define objectives, approve escalation, enforce scope, and remain responsible for ethical and legal boundaries. The chapter emphasizes best practices such as written authorization, avoiding production harm, protecting sensitive data, documenting actions, maintaining human oversight, and following responsible disclosure. For bug hunters, red teams, purple teams, blue teams, SOC analysts, and security leaders, AI pipelines offer faster triage, reusable playbooks, stronger detection feedback, safer training, and clearer accountability, turning creative security testing into a measurable and repeatable practice.

The conventional triage pipeline mental model. This diagram provides a high-level (macro) view of the conventional, human-driven security triage pipeline, as sketched in the notebook. It serves as a roadmap for this linear workflow, starting with data collection and proceeding sequentially through vulnerability assessment, risk scoring, and attack path planning. This entire, predictable sequence traditionally concludes with a human operator making a final decision or handoff.

Seven best practices for offensive security. 1) Authorized scope only: to ensure we do not cause damage or expose sensitive information, we need to work within pre-defined, scoped boundaries. 2) No production harm: do not do anything that will negatively affect production traffic. 3) Follow the law: we must stay compliant with applicable laws and regulations. 4) Human oversight: humans must be in the loop to review and validate findings. 5) Protect sensitive data: proper processes must be set up to ensure personally identifiable information is not exposed. 6) Document everything: we must store logs, traces, and other artifacts that will allow us to audit our systems. 7) Responsible disclosure: we should give the affected party a reasonable amount of time to fix issues before publicly revealing them. These best practices ensure that offensive security teams consistently deliver value and maintain professionalism within their organization.

An illustrative example of how an LLM generates the next token. As the LLM generates a sentence, it considers the context of the tokens generated so far. It takes in those tokens and assigns a score (a logit) to each candidate next token, which can be converted into a probability. In the example above, since green has the highest logit value, it is the next word to be generated in the sentence.
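The selection step described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not how any particular model is implemented: the candidate tokens and logit values are made up for the example, and it assumes greedy decoding (always picking the most probable token), whereas real systems often sample from the distribution instead.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(val - peak) for tok, val in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def greedy_next_token(logits):
    """Return the most probable token and the full probability table."""
    probs = softmax(logits)
    return max(probs, key=probs.get), probs


# Hypothetical logits for three candidate next tokens (illustrative values only).
example_logits = {"green": 4.1, "blue": 2.3, "tall": 0.7}
token, probs = greedy_next_token(example_logits)
print(token, {tok: round(p, 3) for tok, p in probs.items()})
```

Because softmax preserves ordering, the token with the highest logit is also the token with the highest probability, which is why the caption can reason about logits directly.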

An overview of AI agent systems. AI agents consist of four components that are orchestrated together to produce an outcome: 1) the model, which is a foundation model; 2) tools, which are functions that the LLM can use to interact with the world (e.g., custom functions, APIs, MCP servers); 3) memory, where previous interactions are stored either in the context window or in a vector database; and 4) the knowledge base, where additional context (documents, old conversations, etc.) is stored in a vector database.

The Dynamic AI Agent System Mental Model. This diagram models the more dynamic system that results from introducing an AI agent, representing the technology and its surrounding world. Unlike the linear sequence in the conventional triage model, the central agent creates a cyclical, event-driven workflow that allows it to initiate reconnaissance, penetration testing, or triage in response to new data. This model provides a framework for understanding the complex, parallel interactions and feedback loops unique to the AI-driven system. A reader can use this model to predict the AI's behavior or debug its emergent actions.

An example: reconnaissance agent pipeline. An AI agent system consists of 1) a data pipeline that feeds an LLM logs and other inputs from the system, 2) a reasoning component that allows AI models to determine appropriate actions and steps, 3) an evaluation component that assesses the impact of the changes, and 4) a reporting system for the security professionals.

Summary

Large language models (LLMs) introduce contextual reasoning to security testing, turning raw data into actionable intelligence when guided by skilled professionals.
Because LLMs are probabilistic systems, their outputs can be unreliable without validation; human oversight is essential to ensure accuracy and safety.
AI agents build on LLMs by adding memory, planning, and tool-use capabilities, enabling reasoning systems that can act rather than merely respond.
Pipelines provide the structure agents need to remain reliable and accountable—defining clear stages for input, reasoning, action, evaluation, and reporting.
AI agent pipelines allow offensive security teams to scale intelligence without losing control—empowering individuals, red teams, and CISOs alike to achieve measurable, repeatable outcomes.

FAQ

What is the main idea of “Threat-modeling agentic pipelines”?

The chapter explains how AI reshapes offensive security by introducing agentic pipelines: structured workflows where AI agents interpret artifacts, prioritize work, and support decisions while humans define objectives and approve escalation. The goal is not uncontrolled automation, but repeatable, auditable, and ethical security testing.

How do AI agents differ from traditional offensive security tools?

Traditional tools execute specific tasks, apply rules, and produce output. AI agents add reasoning, planning, memory, and tool use. Instead of only running commands, an agent can interpret results, decide what matters, refine a strategy, and suggest or trigger controlled next steps within a defined pipeline.

Why are tools alone no longer enough in modern offensive security?

Modern environments generate too much data for humans to interpret manually at scale. Scanners, logs, headers, certificates, and reconnaissance outputs can overwhelm analysts. The limiting factor is no longer execution, but understanding: deciding what matters, what to do next, and why. AI agents help reduce this cognitive burden by reasoning across results and prioritizing effort.

What is an artifact in an offensive security pipeline?

An artifact is a structured record of what a tool observed, such as open ports, resolved subdomains, HTTP response metadata, discovered endpoints, authentication behavior, error messages, or scan results. Artifacts are evidence, not interpretation. Pipelines move artifacts forward so agents or humans can decide which ones matter.
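One way to picture an artifact as a structured record is the sketch below. The `Artifact` class and its field names (`kind`, `source_tool`, `target`, `data`, `observed_at`) are assumptions made for illustration, not the book's schema. The record holds only what a tool observed, and the digest gives a simple way to check later that the evidence has not changed.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Artifact:
    """Hypothetical artifact record: an observation, with no interpretation."""
    kind: str          # e.g. "open_port" or "http_metadata"
    source_tool: str   # the tool that produced the observation
    target: str        # the host or URL that was observed
    data: dict         # the raw observation itself
    observed_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialize with sorted keys so the same record always yields the same text."""
        return json.dumps(asdict(self), sort_keys=True)

    def digest(self) -> str:
        """SHA-256 of the serialized record, usable as an integrity check."""
        return hashlib.sha256(self.to_json().encode("utf-8")).hexdigest()


# Illustrative values only; the timestamp is fixed so the digest is repeatable.
port_artifact = Artifact(
    kind="open_port",
    source_tool="nmap",
    target="10.0.0.5",
    data={"port": 443, "protocol": "tcp", "state": "open"},
    observed_at="2026-01-01T00:00:00+00:00",
)
print(port_artifact.to_json())
print(port_artifact.digest())
```

Keeping the record frozen and free of fields such as "severity" or "priority" mirrors the distinction in the answer above: interpretation is added by a later stage, not stored in the evidence.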

What role do pipelines play in AI-assisted offensive security?

Pipelines define how information moves from one stage of testing to the next. They route artifacts, enforce decision gates, log actions, and make workflows reproducible. In this model, tools execute, agents interpret, and pipelines govern the flow of information.

How does an LLM become an AI agent?

An LLM becomes part of an AI agent when it is combined with orchestration, tools, memory, goals, and sometimes a knowledge base. The LLM provides reasoning and language generation, while the agent system determines next steps, calls tools, evaluates results, and loops until a goal is met or a limit is reached.
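The loop described in this answer can be sketched as follows. Everything here is a stand-in: `stub_model` replaces a real LLM call, `lookup_banner` is a hypothetical tool that returns canned data instead of touching a network, and `run_agent` is an illustrative name rather than an API from any agent framework.

```python
def run_agent(goal, model, tools, max_steps=5):
    """Ask the model for a next step, run the chosen tool, remember the result,
    and repeat until the model finishes or the step limit is reached."""
    memory = []  # list of (action, argument, observation) tuples
    for _ in range(max_steps):
        action, argument = model(goal, memory)
        if action == "finish":
            return argument, memory
        if action not in tools:
            memory.append((action, argument, "error: unknown tool"))
            continue
        observation = tools[action](argument)
        memory.append((action, argument, observation))
    return None, memory  # limit reached without an answer


def stub_model(goal, memory):
    """Stand-in for an LLM: look up one banner, then finish with a summary."""
    if not memory:
        return "lookup_banner", "10.0.0.5:22"
    return "finish", f"{goal}: {memory[-1][2]}"


def lookup_banner(address):
    """Hypothetical tool backed by canned data; performs no network activity."""
    canned = {"10.0.0.5:22": "SSH-2.0-OpenSSH_8.9"}
    return canned.get(address, "no banner recorded")


answer, trace = run_agent(
    "Identify the SSH service", stub_model, {"lookup_banner": lookup_banner}
)
print(answer)
print(trace)
```

The `max_steps` limit is the "limit is reached" half of the loop condition: an agent whose model never returns `finish` still stops, and the accumulated memory remains available for review.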

What are the core components of an AI security pipeline?

A well-designed AI security pipeline typically collects trusted input from tools, logs, or APIs; interprets context using an LLM or agent; acts through controlled tools or scripts; evaluates results against rules or safety checks; and reports findings in a structured format. Each stage should be monitored, logged, and reviewable.
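The five stages named in this answer can be laid out as plain functions, as in the sketch below. It is a toy under stated assumptions: the input format ("host port state" per line), the set of "interesting" ports, and the allowed-host check are all invented for illustration, and the `interpret` stage is a simple filter standing in for an LLM or agent.

```python
import json


def collect(raw_lines):
    """Stage 1: parse trusted tool output ("host port state") into records."""
    records = []
    for line in raw_lines:
        host, port, state = line.split()
        records.append({"host": host, "port": int(port), "state": state})
    return records


def interpret(records):
    """Stage 2: stand-in for an LLM or agent; keep open ports of interest."""
    interesting = {22, 3389}  # assumption made for this example
    return [r for r in records if r["state"] == "open" and r["port"] in interesting]


def act(findings):
    """Stage 3: controlled action; this sketch only proposes follow-ups."""
    return [
        {"finding": f, "proposal": f"review service on {f['host']}:{f['port']}"}
        for f in findings
    ]


def evaluate(proposals, allowed_hosts):
    """Stage 4: safety check; drop proposals for hosts outside the allowed set."""
    return [p for p in proposals if p["finding"]["host"] in allowed_hosts]


def report(approved):
    """Stage 5: emit findings in a structured format."""
    return json.dumps({"count": len(approved), "items": approved}, indent=2)


def run_pipeline(raw_lines, allowed_hosts, log):
    """Run the stages in order, logging how many items leave each stage."""
    stages = [
        ("collect", collect),
        ("interpret", interpret),
        ("act", act),
        ("evaluate", lambda proposals: evaluate(proposals, allowed_hosts)),
    ]
    data = raw_lines
    for name, stage in stages:
        data = stage(data)
        log.append((name, len(data)))
    log.append(("report", len(data)))
    return report(data)


raw = ["10.0.0.5 22 open", "10.0.0.5 80 open", "10.0.0.9 3389 open", "10.0.0.5 3389 closed"]
stage_log = []
output = run_pipeline(raw, {"10.0.0.5"}, stage_log)
print(output)
print(stage_log)
```

The per-stage log is what makes the run reviewable: it shows where items were filtered out, so a reviewer can see that the second host was dropped by the safety check rather than silently ignored.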

Why is human oversight important in agentic offensive security workflows?

Human oversight ensures that AI-assisted testing remains ethical, legal, and safe. Humans define objectives, approve escalation, validate findings, and review actions that could cause data loss, service disruption, or privacy violations. AI can accelerate analysis, but it should not replace professional judgment.

What are the key safety practices for offensive security testing with AI?

The chapter emphasizes several practices: stay within authorized written scope, avoid harming production systems, follow applicable laws, require human oversight for high-risk actions, protect sensitive data, document prompts and decisions, and follow responsible disclosure when real vulnerabilities are found.
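Two of these practices, staying within authorized scope and requiring human oversight for high-risk actions, lend themselves to a simple programmatic check. The sketch below uses Python's standard `ipaddress` module; the network range, the action names, and the `gate` function itself are hypothetical and would come from the written authorization in a real engagement.

```python
import ipaddress

# Hypothetical scope and risk policy, for illustration only.
AUTHORIZED_NETWORKS = [ipaddress.ip_network("10.0.0.0/24")]
HIGH_RISK_ACTIONS = {"exploit", "credential_spray"}


def in_scope(target):
    """Return True only if the target is an IP address inside an authorized network."""
    try:
        address = ipaddress.ip_address(target)
    except ValueError:
        return False  # not a valid IP address, so it cannot be confirmed as in scope
    return any(address in network for network in AUTHORIZED_NETWORKS)


def gate(action, target, approved_by=None):
    """Decide whether an action may proceed, and say why."""
    if not in_scope(target):
        return False, "denied: target outside authorized scope"
    if action in HIGH_RISK_ACTIONS and not approved_by:
        return False, "denied: high-risk action needs human approval"
    return True, "allowed"


print(gate("port_scan", "10.0.0.7"))
print(gate("port_scan", "192.168.1.1"))
print(gate("exploit", "10.0.0.7"))
print(gate("exploit", "10.0.0.7", approved_by="engagement lead"))
```

The gate fails closed: anything it cannot positively confirm as in scope, including hostnames it cannot parse as addresses, is denied rather than allowed.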

How do AI agent pipelines benefit different security roles?

Bug hunters can use pipelines to triage data and draft reports faster. Red teams and penetration testers can turn engagements into reusable playbooks. Purple teams and detection engineers can use offensive traces to improve defenses. Blue teams and SOC analysts can replay recorded runs in safe environments. Security leaders gain visibility, accountability, and auditable evidence of testing outcomes.

pro $24.99 per month

access to all Manning books, MEAPs, liveVideos, liveProjects, and audiobooks!
choose one free eBook per month to keep
exclusive 50% discount on all purchases
renews monthly, pause or cancel renewal anytime

lite $19.99 per month

access to all Manning books, including MEAPs!

team

5, 10 or 20 seats+ for your team - learn more

eBook

pdf, ePub, online

$55.99 $27.99

you save $28.00 (50%)
