Overview

1 Big Picture: What is GPT?

Artificial intelligence has surged into everyday conversation, largely due to ChatGPT, and this chapter sets the stage by explaining what GPT is and why it matters. It introduces GPT as a Generative Pre-trained Transformer, situating it within the broader category of large language models that create new content based on patterns learned from vast text data. The authors aim to demystify how these systems work in plain language, clarifying what they can and cannot do and why. The discussion emphasizes that ChatGPT’s impact stems from scale—models that are far larger and trained on far more data than earlier approaches—enabling convincing conversation, summarization, and instruction following.

At a high level, GPTs are language models from the field of natural language processing that learn statistical relationships among words, not human-like understanding. They rely on neural networks—especially Transformers—to encode and generate text, and their effectiveness arises from massive datasets, enormous parameter counts, and specialized hardware. The chapter contrasts human, interactive language learning with the model’s static training process, stressing that the learned representation is powerful but fallible and steerable. A central lesson is that “bigger often beats clever”: with LLMs, performance gains frequently come more from more data and larger models than from intricate algorithmic tweaks. Training such systems is computationally expensive, so the book focuses on durable concepts rather than fast-changing implementation details.
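
To make "statistical relationships among words" concrete, the sketch below shows a bigram model in Python, the simplest relative of the next-word prediction that LLMs perform. The toy corpus is invented purely for illustration; real LLMs learn far richer patterns with neural networks, but the underlying task of scoring likely next words is the same.

    from collections import Counter, defaultdict

    # Toy corpus, invented purely for illustration.
    corpus = "the cat sat on the mat . the dog sat on the rug .".split()

    # Count how often each word follows each other word (bigram counts).
    follows = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        follows[prev][nxt] += 1

    # Turn the counts into next-word probabilities for the word "the".
    counts = follows["the"]
    total = sum(counts.values())
    for word, n in counts.most_common():
        print(f"P({word} | the) = {n / total:.2f}")
    # Prints P(cat | the) = 0.25, P(mat | the) = 0.25, and so on.

An LLM performs a vastly more sophisticated version of this idea, with billions of learned parameters standing in for the count table.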

The chapter also frames both the promise and the risks. GPTs can accelerate tasks like summarization, content creation, and dialogue, yet they can make confident mistakes, fail at seemingly simple problems, and be manipulated into harmful outputs. Responsible use requires skepticism, validation, and thoughtful system design that anticipates failure modes and ethical concerns. The book targets a broad audience with minimal coding and math prerequisites, offering the vocabulary and mental models needed to evaluate applications, understand limitations, and decide when to use or avoid LLMs. By the end, readers should be equipped to engage with GPTs’ capabilities and constraints in real-world contexts.

Figure: A simple Haiku generated by ChatGPT.

Figure: Generative AI is about taking some input (numbers, text, images) and producing a new output (usually text or images). Any combination of input/output options is possible, and the nature of the output depends on what the algorithm was trained for. It could be to add detail, rewrite something to be shorter, extrapolate missing portions, and more.

Figure: A high-level map of the various terms you’ll become familiar with and how they relate. Generative AI is a description of functionality: the function of generating content and using techniques from AI to accomplish that goal.

Figure: When you sign up for OpenAI’s ChatGPT, you have two options: the GPT-3.5 model that you can use for free or the GPT-4 model that costs money.

Figure: If the cleverness of an algorithm is based on how much information you encode into the design, older techniques often increase performance by being more clever than their predecessors. As reflected by the size of the circles, LLMs have mostly chosen a “dumber” approach of just using more data and parameters, imposing minimal constraints on what the algorithm can learn.

Summary

  • ChatGPT is one type of Large Language Model, which is itself in the larger family of Generative AI/ML. Generative models produce new output, and LLMs are unique in the quality of their output but are extremely costly to make and use.
  • ChatGPT is loosely patterned after an incomplete understanding of human brain function and language learning. This is used as inspiration in design, but it does not mean ChatGPT has the same abilities or weaknesses as humans.
  • Intelligence is a multi-faceted and hard-to-quantify concept, making it difficult to say if LLMs are intelligent. It is easier to think about LLMs and their potential use in terms of capabilities and reliability.
  • Human language must be converted to and from an LLM’s internal representation. How this representation is formed will change what an LLM learns and influence how you can build solutions using LLMs; a toy sketch of this conversion follows this list.
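
Here is a toy sketch of that text-to-representation conversion. Real systems like ChatGPT use learned subword tokenizers (for example, byte-pair encoding) with vocabularies of tens of thousands of entries; this whole-word version is invented only to show the round trip from text to integer IDs and back.

    # Toy vocabulary, invented for illustration; real LLMs use learned
    # subword vocabularies, not a handful of whole words.
    vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3, "on": 4, "mat": 5}
    id_to_word = {i: w for w, i in vocab.items()}

    def encode(text):
        # Map each word to its ID; unknown words fall back to <unk>.
        return [vocab.get(w, vocab["<unk>"]) for w in text.lower().split()]

    def decode(ids):
        # Map IDs back to words and rejoin them into text.
        return " ".join(id_to_word[i] for i in ids)

    ids = encode("The cat sat on the mat")
    print(ids)          # [1, 2, 3, 4, 1, 5]
    print(decode(ids))  # the cat sat on the mat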

FAQ

What does GPT stand for, and what does each word imply?
GPT stands for Generative Pre-trained Transformer. Generative means the model produces new content (like text) based on patterns learned from data. Pre-trained indicates it is first trained on very large text collections before being adapted for use. Transformer is the core neural network component that makes modern LLMs possible; later chapters explain it in detail.

What is generative AI, and how does ChatGPT fit in?
Generative AI creates new media (text, images, audio, or video) by learning from vast amounts of prior data and feedback about desirable outputs. ChatGPT is a generative model focused primarily on text: given a prompt, it produces novel, human-like responses. Although centered on text, related systems can mix inputs and outputs across modalities.
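
As a concrete illustration of "prompt in, generated text out," here is a minimal sketch using OpenAI's official Python package, assuming its v1.x interface and an API key in the OPENAI_API_KEY environment variable (model names and availability change over time):

    from openai import OpenAI

    client = OpenAI()  # reads the OPENAI_API_KEY environment variable

    # Send a prompt; the model returns newly generated text.
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # substitute any chat model you have access to
        messages=[{"role": "user", "content": "Write a haiku about autumn."}],
    )
    print(response.choices[0].message.content)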

How are AI, NLP, LLMs, Transformers, and ChatGPT related?
Artificial Intelligence is the broad field; Natural Language Processing (NLP) focuses on human language within AI. Large Language Models (LLMs) are NLP models trained on massive text corpora, and Transformers are the main building block of today's LLMs. ChatGPT is a product built on an LLM that uses the Transformer architecture.

Why are these models called "large," and how large is ChatGPT?
They are "large" because they contain enormous numbers of learned parameters and are trained on huge datasets. ChatGPT is rumored to have about 1.76 trillion parameters, on the order of seven terabytes just to hold the model in memory, requiring many high-end GPUs and multi-machine infrastructure. By contrast, more conventional language models can be a few gigabytes.
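
The seven-terabyte figure is simple arithmetic: parameter count times bytes per parameter. A quick back-of-envelope check, assuming the rumored count and 32-bit (4-byte) floating-point storage (16-bit storage would halve it):

    params = 1.76e12      # rumored parameter count
    bytes_per_param = 4   # 32-bit floats; half precision would use 2
    print(f"{params * bytes_per_param / 1e12:.2f} TB")  # prints 7.04 TB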

Why do GPTs and other LLMs perform so well compared to older methods?
Recent gains come less from clever new tricks and more from scaling up data and model size. LLMs use relatively simple mechanisms to capture relationships in text, but at massive scale this "brute-force" approach outperforms many older, carefully engineered algorithms. The trade-off is that such models can be costly and complex to deploy.

How do humans and machines represent language differently?
Humans learn language interactively over time, drawing on rich innate and social cues; our understanding is flexible and context-driven. LLMs learn a static representation from large text datasets via neural networks that approximate patterns in language. Their representations can be powerful but imperfect, and we can shape or constrain their behavior through design.

What can ChatGPT do well today?
It excels at tasks like dialogue, instruction following, summarization, question answering, and content creation. Its strong language representations enable outputs that often feel human-written. New applications continue to emerge as people rethink workflows around these capabilities.

What are the main limitations and risks when using ChatGPT?
LLMs can fail in surprising ways, even on simple tasks, and may produce confident but incorrect or unsafe outputs. Safety systems can sometimes be bypassed, and small error rates can scale to large harms with millions of users. Effective use demands skepticism, verification, and thoughtful design, along with attention to ethical and societal impacts.

Can I train my own LLM? What resources would I need?
Training a modern LLM is expensive and resource-intensive. Even modest efforts can cost $100,000 or more, and competing with state-of-the-art systems would require investments on the order of hundreds of millions of dollars plus significant infrastructure. Because tools evolve rapidly, this book emphasizes durable concepts over step-by-step training code.
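
To see where a six-figure estimate comes from, here is a back-of-envelope sketch; every number below is a hypothetical assumption for illustration, not a quoted rate from any provider:

    gpus = 64                 # hypothetical cluster size
    hours = 24 * 30           # one month of continuous training
    usd_per_gpu_hour = 2.50   # hypothetical cloud rental rate
    print(f"${gpus * hours * usd_per_gpu_hour:,.0f}")  # prints $115,200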

Who is this book for, and what will I learn?
It targets a broad audience, technical and non-technical alike, assuming minimal coding and math background. You'll gain a practical vocabulary, a big-picture understanding of how LLMs work, what they can and can't do, and how to use or deploy them responsibly. The first half explains LLM mechanics and outputs; the second half focuses on human factors, risks, and ethics.
