Deep Learning with PyTorch
Eli Stevens, Luca Antiga, and Thomas Viehmann
Foreword by Soumith Chintala
  • July 2020
  • ISBN 9781617295263
  • 520 pages
  • printed in black & white

With this publication, we finally have a definitive treatise on PyTorch. It covers the basics and abstractions in great detail.

From the Foreword by Soumith Chintala, Cocreator of PyTorch
Every other day we hear about new ways to put deep learning to good use: improved medical imaging, accurate credit card fraud detection, long-range weather forecasting, and more. PyTorch puts these superpowers in your hands, providing a comfortable Python experience that gets you started quickly and then grows with you as you—and your deep learning skills—become more sophisticated. Deep Learning with PyTorch will make that journey engaging and fun.

About the Technology

Although many deep learning tools use Python, the PyTorch library is truly Pythonic. Instantly familiar to anyone who knows PyData tools like NumPy and scikit-learn, PyTorch simplifies deep learning without sacrificing advanced features. It's excellent for building models quickly, and it scales smoothly from laptop to enterprise. Because companies like Apple, Facebook, and JPMorgan Chase rely on PyTorch, it's a great skill to have as you expand your career options. Getting started with PyTorch is easy: it minimizes cognitive overhead while keeping advanced features within reach, so you can focus on what matters most, building and training state-of-the-art deep learning models and making a dent in the world. PyTorch is also a snap to scale and extend, and it partners well with other Python tooling. It has been widely adopted by deep learning practitioners and by first-class players like FAIR, OpenAI, FastAI, and Purdue.
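
To illustrate that NumPy-like feel, here is a minimal sketch (our own illustration, not code from the book) that creates a tensor from a NumPy array, moves it to a GPU when one is available, and converts it back:

```python
import numpy as np
import torch

# Build a tensor directly from a NumPy array; on the CPU the two share memory.
points = torch.from_numpy(np.array([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]]))

# Move the tensor to a GPU if one is available, otherwise keep it on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
points = points.to(device)

# Familiar NumPy-style operations work the same way on tensors.
print(points.mean(dim=0))

# Convert back to NumPy (after returning to the CPU) for use with other PyData tools.
print(points.cpu().numpy())
```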

About the book

Deep Learning with PyTorch teaches you to create neural networks and deep learning systems with PyTorch. This practical book quickly gets you to work building a real-world example from scratch: a tumor image classifier. Along the way, it covers best practices for the entire DL pipeline, including the PyTorch Tensor API, loading data in Python, monitoring training, and visualizing results.

After covering the basics, the book takes you on a journey through larger projects. The centerpiece is a neural network designed for cancer detection. You'll discover ways to train networks with limited data and start processing that data to get initial results. You'll sift through the unreliable early results and learn how to diagnose and fix the problems in your neural network. Finally, you'll look at ways to improve your results by training with augmented data, improving the model architecture, and performing other fine-tuning.
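
As a preview of the training mechanics the book builds up (parameter estimation, loss functions, autograd, and optimizers in chapters 5 and 6), here is a minimal, self-contained PyTorch training loop on synthetic data; the linear model, data, and hyperparameters are illustrative assumptions, not the book's tumor classifier:

```python
import torch
import torch.nn as nn

# Synthetic data: learn y = 3x + 1 from noisy samples (illustrative only).
x = torch.linspace(-1.0, 1.0, steps=100).unsqueeze(1)
y = 3.0 * x + 1.0 + 0.1 * torch.randn_like(x)

model = nn.Linear(1, 1)                      # single-input, single-output linear model
loss_fn = nn.MSELoss()                       # mean squared error loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(200):
    optimizer.zero_grad()                    # clear gradients from the previous step
    loss = loss_fn(model(x), y)              # forward pass and loss computation
    loss.backward()                          # autograd computes the gradients
    optimizer.step()                         # gradient descent update

print(model.weight.item(), model.bias.item())  # should approach 3.0 and 1.0
```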
Table of Contents

Part 1: Core PyTorch

1 Introducing deep learning and the PyTorch Library

1.1 The deep learning revolution

1.2 PyTorch for deep learning

1.3 Why PyTorch?

1.3.1 The deep learning competitive landscape

1.4 An overview of how PyTorch supports deep learning projects

1.5 Hardware and software requirements

1.5.1 Using Jupyter Notebooks

1.6 Exercises

1.7 Summary

2 Pretrained networks

2.1 A pretrained network that recognizes the subject of an image

2.1.1 Obtaining a pretrained network for image recognition

2.1.2 AlexNet

2.1.3 ResNet

2.1.4 Ready, set, almost run

2.1.5 Run!

2.2 A pretrained model that fakes it until it makes it

2.2.1 The GAN game

2.2.2 CycleGAN

2.2.3 A network that turns horses into zebras

2.3 A pretrained network that describes scenes

2.3.1 NeuralTalk2

2.4 Torch Hub

2.5 Conclusion

2.6 Exercises

2.7 Summary

3 It starts with a tensor

3.1 The world as floating-point numbers

3.2 Tensors: Multidimensional arrays

3.2.1 From Python lists to PyTorch tensors

3.2.2 Constructing our first tensors

3.2.3 The essence of tensors

3.3 Indexing tensors

3.4 Named tensors

3.5 Tensor element types

3.5.1 Specifying the numeric type with dtype

3.5.2 A dtype for every occasion

3.5.3 Managing a tensor’s dtype attribute

3.6 The tensor API

3.7 Tensors: Scenic views of storage

3.7.1 Indexing into storage

3.7.2 Modifying stored values: In-place operations

3.8 Tensor metadata: Size, offset, and stride

3.8.1 Views of another tensor’s storage

3.8.2 Transposing without copying

3.8.3 Transposing in higher dimensions

3.8.4 Contiguous tensors

3.9 Moving tensors to the GPU

3.9.1 Managing a tensor’s device attribute

3.10 NumPy interoperability

3.11 Generalized tensors are tensors, too

3.12 Serializing tensors

3.12.1 Serializing to HDF5 with h5py

3.13 Conclusion

3.14 Exercises

3.15 Summary

4 Real-world data representation using tensors

4.1 Working with images

4.1.1 Adding color channels

4.1.2 Loading an image file

4.1.3 Changing the layout

4.1.4 Normalizing the data

4.2 3D images: Volumetric data

4.2.1 Loading a specialized format

4.3 Representing tabular data

4.3.1 Using a real-world dataset

4.3.2 Loading a wine data tensor

4.3.3 Representing scores

4.3.4 One-hot encoding

4.3.5 When to categorize

4.3.6 Finding thresholds

4.4 Working with time series

4.4.1 Adding a time dimension

4.4.2 Shaping the data by time period

4.4.3 Ready for training

4.5 Representing text

4.5.1 Converting text to numbers

4.5.2 One-hot-encoding characters

4.5.3 One-hot encoding whole words

4.5.4 Text embeddings

4.5.5 Text embeddings as a blueprint

4.6 Conclusion

4.7 Exercises

4.8 Summary

5 The mechanics of learning

5.1 A timeless lesson in modeling

5.2 Learning is just parameter estimation

5.2.1 A hot problem

5.2.2 Gathering some data

5.2.3 Visualizing the data

5.2.4 Choosing a linear model as a first try

5.3 Less loss is what we want

5.3.1 From problem back to PyTorch

5.4 Down along the gradient

5.4.1 Decreasing loss

5.4.2 Getting analytical

5.4.3 Iterating to fit the model

5.4.4 Normalizing inputs

5.4.5 Visualizing (again)

5.5 PyTorch’s autograd: Backpropagating all things

5.5.1 Computing the gradient automatically

5.5.2 Optimizers a la carte

5.5.3 Training, validation, and overfitting

5.5.4 Autograd nits and switching it off

5.6 Conclusion

5.7 Exercise

5.8 Summary

6 Using a neural network to fit the data

6.1 Artificial neurons

6.1.1 Composing a multilayer network

6.1.2 Understanding the error function

6.1.3 All we need is activation

6.1.4 More activation functions

6.1.5 Choosing the best activation function

6.1.6 What learning means for a neural network

6.2 The PyTorch nn module

6.2.1 Using call rather than forward

6.2.2 Returning to the linear model

6.3 Finally a neural network

6.3.1 Replacing the linear model

6.3.2 Inspecting the parameters

6.3.3 Comparing to the linear model

6.4 Conclusion

6.5 Exercises

6.6 Summary

7 Telling birds from airplanes: Learning from images

7.1 A dataset of tiny images

7.1.1 Downloading CIFAR-10

7.1.2 The Dataset class

7.1.3 Dataset transforms

7.1.4 Normalizing data

7.2 Distinguishing birds from airplanes

7.2.1 Building the dataset

7.2.2 A fully connected model

7.2.3 Output of a classifier

7.2.4 Representing the output as probabilities

7.2.5 A loss for classifying

7.2.6 Training the classifier

7.2.7 The limits of going fully connected

7.3 Conclusion

7.4 Exercises

7.5 Summary

8 Using convolutions to generalize

8.1 The case for convolutions

8.1.1 What convolutions do

8.2 Convolutions in action

8.2.1 Padding the boundary

8.2.2 Detecting features with convolutions

8.2.3 Looking further with depth and pooling

8.2.4 Putting it all together for our network

8.3 Subclassing nn.Module

8.3.1 Our network as an nn.Module

8.3.2 How PyTorch keeps track of parameters and submodules

8.3.3 The functional API

8.4 Training our convnet

8.4.1 Measuring accuracy

8.4.2 Saving and loading our model

8.4.3 Training on the GPU

8.5 Model design

8.5.1 Adding memory capacity: Width

8.5.2 Helping our model to converge and generalize: Regularization

8.5.3 Going deeper to learn more complex structures: Depth

8.5.4 Comparing the designs from this section

8.5.5 It’s already outdated

8.6 Conclusion

8.7 Exercises

8.8 Summary

Part 2: Learning from images in the real world: Early detection of lung cancer

9 Using PyTorch to fight cancer

9.1 Introduction to the use case

9.2 Preparing for a large-scale project

9.3 What is a CT scan, exactly?

9.4 The project: An end-to-end detector for lung cancer

9.4.1 Why can’t we just throw data at a neural network until it works?

9.4.2 What is a nodule?

9.4.3 Our data source: The LUNA Grand Challenge

9.4.4 Downloading the LUNA data

9.5 Conclusion

9.6 Summary

10 Combining data sources into a unified dataset

10.1 Raw CT data files

10.2 Parsing LUNA’s annotation data

10.2.1 Training and validation sets

10.2.2 Unifying our annotation and candidate data

10.3 Loading individual CT scans

10.3.1 Hounsfield Units

10.4 Locating a nodule using the patient coordinate system

10.4.1 The patient coordinate system

10.4.2 CT scan shape and voxel sizes

10.4.3 Converting between millimeters and voxel addresses

10.4.4 Extracting a nodule from a CT scan

10.5 A straightforward dataset implementation

10.5.1 Caching candidate arrays with the getCtRawCandidate function

10.5.2 Constructing our dataset in LunaDataset.init

10.5.3 A training/validation split

10.5.4 Rendering the data

10.6 Conclusion

10.7 Exercises

10.8 Summary

11 Training a classification model to detect suspected tumors

11.1 A foundational model and training loop

11.2 The main entry point for our application

11.3 Pretraining setup and initialization

11.3.1 Initializing the model and optimizer

11.3.2 Care and feeding of data loaders

11.4 Our first-pass neural network design

11.4.1 The core convolutions

11.4.2 The full model

11.5 Training and validating the model

11.5.1 The computeBatchLoss function

11.5.2 The validation loop is similar

11.6 Outputting performance metrics

11.6.1 The logMetrics function

11.7 Running the training script

11.7.1 Needed data for training

11.7.2 Interlude: The enumerateWithEstimate function

11.8 Evaluating the model: Getting 99.7% correct means we’re done, right?

11.9 Graphing training metrics with TensorBoard

11.9.1 Running TensorBoard

11.9.2 Adding TensorBoard support to the metrics logging function

11.10 Why isn’t the model learning to detect nodules?

11.11 Conclusion

11.12 Exercises

11.13 Summary

12 Improving training with metrics and augmentation

12.1 High-level plan for improvement

12.2 Good dogs vs. bad guys: False positives and false negatives

12.3 Graphing the positives and negatives

12.3.1 Recall is Roxie’s strength

12.3.2 Precision is Preston’s forte

12.3.3 Implementing precision and recall in logMetrics

12.3.4 Our ultimate performance metric: The F1 score

12.3.5 How does our model perform with our new metrics?

12.4 What does an ideal dataset look like?

12.4.1 Making the data look less like the actual and more like the “ideal”

12.4.2 Contrasting training with a balanced LunaDataset to previous runs

12.4.3 Recognizing the symptoms of overfitting

12.5 Revisiting the problem of overfitting

12.5.1 An overfit face-to-age prediction model

12.6 Preventing overfitting with data augmentation

12.6.1 Specific data augmentation techniques

12.6.2 Seeing the improvement from data augmentation

12.7 Conclusion

12.8 Exercises

12.9 Summary

13 Using segmentation to find suspected nodules

13.1 Adding a second model to our project

13.2 Various types of segmentation

13.3 Semantic segmentation: Per-pixel classification

13.3.1 The U-Net architecture

13.4 Updating the model for segmentation

13.4.1 Adapting an off-the-shelf model to our project

13.5 Updating the dataset for segmentation

13.5.1 U-Net has very specific input size requirements

13.5.2 U-Net trade-offs for 3D vs. 2D data

13.5.3 Building the ground truth data

13.5.4 Implementing Luna2dSegmentationDataset

13.5.5 Designing our training and validation data

13.5.6 Implementing TrainingLuna2dSegmentationDataset

13.5.7 Augmenting on the GPU

13.6 Updating the training script for segmentation

13.6.1 Initializing our segmentation and augmentation models

13.6.2 Using the Adam optimizer

13.6.3 Dice loss

13.6.4 Getting images into TensorBoard

13.6.5 Updating our metrics logging

13.6.6 Saving our model

13.7 Results

13.8 Conclusion

13.9 Exercises

13.10 Summary

14 End-to-end nodule analysis, and where to go next

14.1 Towards the finish line

14.2 Independence of the validation set

14.3 Bridging CT segmentation and nodule candidate classification

14.3.1 Segmentation

14.3.2 Grouping voxels into nodule candidates

14.3.3 Did we find a nodule? Classification to reduce false positives

14.4 Quantitative validation

14.5 Predicting malignancy

14.5.1 Getting malignancy information

14.5.2 An area under the curve baseline: Classifying by diameter

14.5.3 Reusing preexisting weights: Fine-tuning

14.5.4 More output in TensorBoard

14.6 What we see when we diagnose

14.6.1 Training, validation, and test sets

14.7 What next? Additional sources of inspiration (and data)

14.7.1 Preventing overfitting: Better regularization

14.7.2 Refined training data

14.7.3 Competition results and research papers

14.8 Conclusion

14.8.1 Behind the curtain

14.9 Exercises

14.10 Summary

Part 3: Deployment

15 Deploying to production

15.1 Serving PyTorch models

15.1.1 Our model behind a Flask server

15.1.2 What we want from deployment

15.1.3 Request batching

15.2 Exporting models

15.2.1 Interoperability beyond PyTorch with ONNX

15.2.2 PyTorch’s own export: Tracing

15.2.3 Our server with a traced model

15.3 Interacting with the PyTorch JIT

15.3.1 What to expect from moving beyond classic Python/PyTorch

15.3.2 The dual nature of PyTorch as interface and backend

15.3.3 TorchScript

15.3.4 Scripting the gaps of traceability

15.4 LibTorch: PyTorch in C++

15.4.1 Running JITed models from C++

15.4.2 C++ from the start: The C++ API

15.5 Going mobile

15.5.1 Improving efficiency: Model design and quantization

15.6 Emerging technology: Enterprise serving of PyTorch models

15.7 Conclusion

15.8 Exercises

15.9 Summary

What's inside

  • Training deep neural networks
  • Implementing modules and loss functions
  • Utilizing pretrained models from PyTorch Hub (see the short sketch after this list)
  • Exploring code samples in Jupyter Notebooks
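
Below is a brief, hypothetical sketch of loading a pretrained model from PyTorch Hub; the repository and model name (torchvision's ResNet-18) are illustrative choices, not necessarily the ones the book uses:

```python
import torch

# Load a pretrained image classifier from PyTorch Hub.
# (Newer torchvision releases use a `weights=` argument instead of `pretrained=`.)
model = torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)
model.eval()

# Run a dummy batch containing one 224x224 RGB image through the network.
dummy_input = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    logits = model(dummy_input)

print(logits.shape)  # torch.Size([1, 1000]) -- one score per ImageNet class
```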

About the reader

For Python programmers with an interest in machine learning.

About the authors

Eli Stevens has held roles ranging from software engineer to CTO, and is currently working on machine learning in the self-driving-car industry. Luca Antiga is cofounder of an AI engineering company and an AI tech startup, as well as a former PyTorch contributor. Thomas Viehmann is a machine learning trainer and consultant based in Munich, Germany, and a PyTorch core developer.
Photo caption: Deep Learning with PyTorch authors Luca Antiga (left) and Eli Stevens (right) eating dessert in San Francisco's Mission District with the book's editor, Frances Lefkowitz. Luca is from Bergamo, Italy; Eli lives in San Jose; and Frances hails from San Francisco.
