Overview
1 Large language models: The foundation of generative AI
Large language models burst into public view with ChatGPT, revealing how generative AI can converse, write, summarize, and assist across domains with seemingly human fluency. This chapter frames LLMs as the new foundation of NLP, explains why they matter, and sets expectations: they are powerful, broadly useful, and rapidly improving, yet imperfect and occasionally unreliable. It emphasizes the need for practical literacy—understanding how LLMs work, where they excel, where they fail, and how to use them responsibly—so individuals and organizations can harness benefits while avoiding common pitfalls.
The chapter traces NLP’s evolution from rule-based systems to statistical learning and then to deep neural networks, culminating in transformers and the attention mechanism that enabled today’s scale and performance. With self-supervised pretraining and task-specific fine-tuning, models like GPT and BERT learned general representations that transfer across tasks. This foundation unlocked wide-ranging applications: language modeling and text generation; open- and closed-book question answering and reading comprehension; assistive coding and pair programming; content creation for marketing, media, and communication; reasoning in math and science with step-by-step methods; and classic NLP tasks such as translation and summarization. The result is a flexible, general-purpose capability that can be adapted to many workflows and products.
Alongside these advances, the chapter foregrounds critical limitations and externalities: biases absorbed from web-scale data, difficulties controlling outputs and preventing hallucinations, and the financial, environmental, and access costs of training and serving large models. It surveys the competitive landscape—OpenAI, Google, Meta, Microsoft, Anthropic, and a growing set of challengers in both enterprise and open-source communities—each pursuing different trade-offs among capability, safety, cost, and accessibility. The chapter closes by arguing that while generative AI is accelerating and permeating everyday tools, durable value will depend on responsible design, governance, and user proficiency in steering LLMs toward reliable, safe, and socially beneficial outcomes.
Summary
- The history of NLP is as old as computers themselves. Machine translation, which first sparked interest in NLP in the 1950s, also became one of the first widely used commercial NLP applications when Google launched Google Translate in 2006.
- Transformer models and the debut of the attention mechanism were the biggest NLP breakthroughs of the 2010s. The attention mechanism attempts to mimic attention in the human brain by placing “importance” on the most relevant information.
- The boom in NLP from the late 2010s to early 2020s was driven by the increasing availability of text data from across the internet and the development of powerful computational resources. This marked the beginning of the LLM era.
- Today’s LLMs are trained primarily with self-supervised learning on large volumes of text from the web and are then commonly aligned using fine-tuning techniques such as reinforcement learning from human feedback (RLHF).
- GPT, released by OpenAI, was one of the first general-purpose LLMs designed for use with any natural language task. These models can be fine-tuned for specific tasks and are especially well-suited for text-generation applications, such as chatbots.
- LLMs are versatile and can be applied to various applications and use cases, including text generation, question answering, coding, logical reasoning, content generation, and more. Of course, there are also inherent risks, such as encoded bias, hallucinations, and a sizable carbon footprint from training and serving.
- By January 2023, two months after launch, OpenAI’s ChatGPT had set a record for the fastest-growing user base in history and set off an AI arms race in the tech industry to develop and release LLM-based conversational dialogue agents. As of 2025, the most significant LLMs have come from OpenAI, Google, Meta, Microsoft, and Anthropic.
FAQ
What is a large language model (LLM)?
An LLM is a neural network trained on vast amounts of text to predict the next token in context. Through this self-supervised pretraining, it learns rich internal representations of language that can be adapted to many tasks, from conversation to coding. Modern LLMs use transformer architectures, often with hundreds of billions of parameters, enabling fluent, general-purpose text generation and understanding.
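The core objective described above can be made concrete with a toy model. Real LLMs use transformer networks over subword tokens; this hypothetical sketch substitutes a simple bigram count model over whole words, which captures only the idea of predicting the next token from context:

```python
from collections import Counter, defaultdict

# Toy illustration of the LLM objective: predict the next token in context.
# The corpus and bigram model here are illustrative stand-ins, not how a
# real LLM is built.
corpus = "the cat sat on the mat the cat ate".split()

# Count how often each token follows each context token.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def next_token_probs(context_token):
    """Return P(next token | previous token) as a dict."""
    counts = following[context_token]
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}

probs = next_token_probs("the")
# In this corpus, "the" is followed by "cat" twice and "mat" once,
# so P(cat | the) = 2/3 and P(mat | the) = 1/3.
```

An LLM plays the same prediction game, but conditions on thousands of preceding tokens with a learned neural network rather than raw counts.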
Why was ChatGPT’s release so consequential?
ChatGPT turned powerful LLM capabilities into an easy-to-use web experience, triggering mass adoption within weeks. It showcased how a conversational interface could write, summarize, answer questions, and explain concepts, while also exposing limitations like hallucinations. Although not a single technical breakthrough, it marked a turning point in public awareness and accelerated investment and deployment across industries.
How did NLP evolve from rules to today’s models?
Early NLP relied on handcrafted rules and heuristics, which were brittle and hard to scale. In the 1990s, statistical methods took over, learning patterns from data. With more data and compute, neural networks—and especially deep learning—became dominant. The transformer architecture then enabled efficient training on massive corpora, sparking today’s LLM era.
What is the attention mechanism and why do transformers matter?
Attention lets a model weight the most relevant parts of an input sequence when generating or understanding text. Transformers use self-attention to capture long-range dependencies while allowing parallel computation, making training faster and more scalable. This shift, popularized by “Attention Is All You Need,” enabled state-of-the-art results and paved the way for LLMs like BERT and GPT.
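The mechanism itself is compact. Below is a minimal NumPy sketch of scaled dot-product attention; the shapes and variable names are illustrative and not tied to any particular library:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query position computes a weighted mix of all value positions."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query-key similarities
    # Softmax over keys (numerically stabilized): each row sums to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 tokens, 8-dimensional queries
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out, w = scaled_dot_product_attention(Q, K, V)
```

Because every position attends to every other position in one matrix multiplication, the whole sequence is processed in parallel, which is what makes transformer training scalable.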
How are LLMs trained and adapted to tasks?
LLMs are pretrained with self-supervised learning (predicting masked or next tokens) on large text corpora, requiring no manual labels. They can then be fine-tuned on smaller, task-specific datasets to specialize—e.g., for sentiment, QA, or code. Many systems also incorporate reinforcement learning to shape behavior. At deployment, the trained model is used via inference, generating outputs for new prompts.
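The pretraining objective needs no labels because the text supplies its own targets: at each position, the target is simply the next token. A hypothetical NumPy sketch of that loss, given model logits:

```python
import numpy as np

def next_token_loss(logits, token_ids):
    """Cross-entropy of the next token at each position.

    logits: (seq_len, vocab) scores from the model.
    token_ids: (seq_len,) integer token IDs of the input text.
    """
    # Targets are the input shifted left by one: at step t, predict token t+1.
    inputs, targets = logits[:-1], token_ids[1:]
    # Numerically stable log-softmax over the vocabulary.
    shifted = inputs - inputs.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

vocab, seq = 10, 5
token_ids = np.array([3, 1, 4, 1, 5])
uniform_logits = np.zeros((seq, vocab))   # a "model" that knows nothing
loss = next_token_loss(uniform_logits, token_ids)
# A clueless model scores log(vocab) = log(10) nats; training drives this down.
```

Pretraining minimizes exactly this quantity over billions of tokens; fine-tuning later reuses the resulting weights on a smaller labeled dataset.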
What can LLMs do in practice?
- Language modeling and text generation (chat, autocomplete, stylistic writing)
- Question answering (extractive, open-book generative, closed-book)
- Coding assistance (code completion, translation from comments to code, test generation)
- Content creation (blogs, emails, marketing copy, news-like articles)
- Reasoning tasks (math, science, common-sense reasoning, with varying reliability)
- Other NLP tasks (machine translation, summarization, grammar correction, learning novel words)
How do LLMs approach question answering?
There are three main approaches: extractive QA (copy the answer from provided context), open-book generative QA (generate an answer using provided context), and closed-book generative QA (answer from the model’s internal knowledge without external context). Reading comprehension benchmarks often combine these skills, testing multi-step understanding and conversational follow-ups.
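The three setups differ mainly in what the model is given and what form the answer may take. As a rough sketch (the prompt wording here is hypothetical; real systems vary widely):

```python
# Illustrative prompt templates for the three QA setups described above.
context = "The transformer architecture was introduced in 2017."
question = "When was the transformer introduced?"

# Extractive QA: the answer must be a span copied verbatim from the context.
extractive = (
    f"Context: {context}\nQuestion: {question}\n"
    "Copy the exact answer span from the context."
)

# Open-book generative QA: answer freely, but grounded in the given context.
open_book = (
    f"Context: {context}\nQuestion: {question}\n"
    "Answer in your own words using the context."
)

# Closed-book generative QA: no context; rely on internal knowledge alone.
closed_book = f"Question: {question}\nAnswer from memory."
```

Closed-book QA is where hallucination risk is highest, since there is no provided context against which the answer can be checked.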
Where do LLMs fall short?
- Hallucinations: fluent but incorrect or fabricated content
- Bias: reproducing and potentially amplifying societal stereotypes from training data
- Control and safety: susceptibility to adversarial prompts and unpredictable outputs
- Sustainability: substantial compute, energy use, and carbon footprint
- Access and concentration: high costs and specialized hardware favor large organizations
Who are the major players in generative AI and how do they differ?
- OpenAI: rapid, multimodal releases (GPT-4, GPT-4o, Sora), broad consumer and developer focus
- Google: foundational transformer research, Gemini ecosystem, cautious principles-driven rollout
- Meta: open-access Llama models emphasizing accessibility and efficiency
- Microsoft: deep OpenAI partnership, Copilot integration across enterprise products
- Anthropic: safety-forward “Constitutional AI,” Claude models
- Others: Mistral (efficient open models), DeepSeek (MoE efficiency), Cohere (enterprise focus), Perplexity (AI search), xAI (Grok), Stability AI, Midjourney, Runway (image/video)
What is fine-tuning and why is it useful?
Fine-tuning takes a broadly pretrained LLM and adapts it to a specific domain or task using a smaller, targeted dataset. It reuses the model’s general language understanding while aligning outputs to desired formats, styles, or constraints. This approach saves time and cost versus training from scratch and typically yields better performance on the intended use case.
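The economics of fine-tuning come from reusing representations rather than relearning them. A minimal sketch of the idea, with a hypothetical frozen random projection standing in for a pretrained encoder and synthetic labels standing in for a downstream task:

```python
import numpy as np

rng = np.random.default_rng(42)
W_pretrained = rng.normal(size=(16, 8))      # frozen "pretrained encoder"

def encode(x):
    """Fixed pretrained features; never updated during fine-tuning."""
    return np.tanh(x @ W_pretrained)

# Tiny labeled dataset for the downstream task. The synthetic labels are
# recoverable from the pretrained features, mimicking a task the encoder's
# representations already support.
X = rng.normal(size=(64, 16))
y = (encode(X)[:, 0] > 0).astype(float)

# Fine-tune: train only a small logistic-regression head by gradient descent.
w, b = np.zeros(8), 0.0
for _ in range(500):
    h = encode(X)
    p = 1 / (1 + np.exp(-(h @ w + b)))       # sigmoid predictions
    grad = p - y                             # d(cross-entropy)/d(logit)
    w -= 0.1 * h.T @ grad / len(y)
    b -= 0.1 * grad.mean()

p = 1 / (1 + np.exp(-(encode(X) @ w + b)))
accuracy = ((p > 0.5) == y).mean()
```

Only the small head (8 weights and a bias) is trained; the encoder's parameters are untouched. The same principle scales up to adapting a billion-parameter LLM with a comparatively tiny task-specific dataset.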