Deep Learning with Python
François Chollet
  • November 2017
  • ISBN 9781617294433
  • 384 pages
  • printed in black & white

The clearest explanation of deep learning I have come across...it was a joy to read.

Richard Tobias, Cephasonics

Listen to this book in liveAudio! liveAudio integrates a professional voice recording with the book’s text, graphics, code, and exercises in Manning’s exclusive liveBook online reader. Use the text to search and navigate the audio, or download the audio-only recording for portable offline listening.

About the Technology

Machine learning has made remarkable progress in recent years. We went from near-unusable speech and image recognition, to near-human accuracy. We went from machines that couldn't beat a serious Go player, to defeating a world champion. Behind this progress is deep learning—a combination of engineering advances, best practices, and theory that enables a wealth of previously impossible smart applications.

About the book

Deep Learning with Python introduces the field of deep learning using the Python language and the powerful Keras library. Written by Keras creator and Google AI researcher François Chollet, this book builds your understanding through intuitive explanations and practical examples. You'll explore challenging concepts and practice with applications in computer vision, natural-language processing, and generative models. By the time you finish, you'll have the knowledge and hands-on skills to apply deep learning in your own projects.

Table of Contents

Part 1: The fundamentals of Deep Learning

1. What is Deep Learning?

1.1. Artificial intelligence, machine learning and deep learning

1.1.1. Artificial intelligence

1.1.2. Machine Learning

1.1.3. Learning representations from data

1.1.4. The "deep" in deep learning

1.1.5. Understanding how deep learning works in three figures

1.1.6. What deep learning has achieved so far

1.1.7. Don’t believe the short-term hype

1.1.8. The promise of AI

1.2. Before deep learning: a brief history of machine learning

1.2.1. Probabilistic modeling

1.2.2. Early neural networks

1.2.3. Kernel methods

1.2.4. Decision trees, random forests, and gradient boosting machines

1.2.5. Back to neural networks

1.2.6. What makes deep learning different

1.2.7. The modern machine-learning landscape

1.3. Why deep learning, why now?

1.3.1. Hardware

1.3.2. Data

1.3.3. Algorithms

1.3.4. A new wave of investment

1.3.5. The democratization of deep learning

1.3.6. Will it last?

2. Before we start: the mathematical blocks of neural networks

2.1. A first look at a neural network

2.2. Data representations for neural networks

2.2.1. Scalars (0D tensors)

2.2.2. Vectors (1D tensors)

2.2.3. Matrices (2D tensors)

2.2.4. 3D tensors and higher-dimensional tensors

2.2.5. Key attributes

2.2.6. Manipulating tensors in Numpy

2.2.7. The notion of data batch

2.2.8. Real-world examples of data tensors

2.2.9. Vector data

2.2.10. Timeseries data or sequence data

2.2.11. Image data

2.2.12. Video data

2.3. The gears of neural networks: tensor operations

2.3.1. Element-wise operations

2.3.2. Broadcasting

2.3.3. Tensor dot

2.3.4. Tensor reshaping

2.3.5. Geometric interpretation of tensor operations

2.3.6. A geometric interpretation of deep learning

2.4. The engine of neural networks: gradient-based optimization

2.4.1. What's a derivative?

2.4.2. Derivative of a tensor operation: the gradient

2.4.3. Stochastic gradient descent

2.4.4. Chaining derivatives: the backpropagation algorithm

2.5. Looking back on our first example

3. Getting started with neural networks

3.1. Anatomy of a neural network

3.1.1. Layers: the Lego bricks of deep learning

3.1.2. Models: networks of layers

3.1.3. Loss functions and optimizers: keys to configuring the learning process

3.2. Introduction to Keras

3.2.1. Keras, TensorFlow, Theano, and CNTK

3.2.2. Developing with Keras: a quick overview

3.3. Setting up a deep learning workstation

3.3.1. Preliminary considerations

3.3.2. Jupyter notebooks: the preferred way to run deep learning experiments

3.3.3. Getting Keras running: two options

3.3.4. Running deep learning jobs in the cloud: pros and cons

3.3.5. What is the best GPU for deep learning?

3.4. Classifying movie reviews: a binary classification example

3.4.1. The IMDB dataset

3.4.2. Preparing the data

3.4.3. Building our network

3.4.4. Validating our approach

3.4.5. Using a trained network to generate predictions on new data

3.4.6. Further experiments

3.4.7. Wrapping up

3.5. Classifying newswires: a multi-class classification example

3.5.1. The Reuters dataset

3.5.2. Preparing the data

3.5.3. Building our network

3.5.4. Validating our approach

3.5.5. Generating predictions on new data

3.5.6. A different way to handle the labels and the loss

3.5.7. On the importance of having sufficiently large intermediate layers

3.5.8. Further experiments

3.5.9. Wrapping up

3.6. Predicting house prices: a regression example

3.6.1. The Boston Housing Price dataset

3.6.2. Preparing the data

3.6.3. Building our network

3.6.4. Validating our approach using K-fold validation

3.6.5. Wrapping up

4. Fundamentals of machine learning

4.1. Four different brands of machine learning

4.1.1. Supervised learning

4.1.2. Unsupervised learning

4.1.3. Self-supervised learning

4.1.4. Reinforcement learning

4.2. Evaluating machine learning models

4.2.1. Training, validation, and test sets

4.2.2. Things to keep in mind

4.3. Data preprocessing, feature engineering and feature learning

4.3.1. Data preprocessing for neural networks

4.3.2. Feature engineering

4.4. Overfitting and underfitting

4.4.1. Reducing the network size

4.4.2. Adding weight regularization

4.4.3. Adding dropout

4.5. The universal workflow of machine learning

4.5.1. Define the problem and assemble a dataset

4.5.2. Pick a measure of success

4.5.3. Decide on an evaluation protocol

4.5.4. Prepare your data

4.5.5. Develop a model that does better than a baseline

4.5.6. Scale up: develop a model that overfits

4.5.7. Regularize your model and tune your hyperparameters

Part 2: Deep learning in practice

5. Deep learning for computer vision

5.1. Introduction to convnets

5.1.1. The convolution operation

5.1.2. The max pooling operation

5.2. Training a convnet from scratch on a small dataset

5.2.1. The relevance of deep learning for small-data problems

5.2.2. Downloading the data

5.2.3. Building our network

5.2.4. Data preprocessing

5.2.5. Using data augmentation

5.3. Using a pre-trained convnet

5.3.1. Feature extraction

5.3.2. Fine-tuning

5.3.3. Wrapping up

5.4. Visualizing what convnets learn

5.4.1. Visualizing intermediate activations

5.4.2. Visualizing convnet filters

5.4.3. Visualizing heatmaps of class activation

6. Deep learning for text and sequences

6.1. Working with text data

6.1.1. One-hot encoding of words or characters

6.1.2. Using word embeddings

6.1.3. Putting it all together: from raw text to word embeddings

6.1.4. Wrapping up

6.2. Understanding recurrent neural networks

6.2.1. A first recurrent layer in Keras

6.2.2. Understanding the LSTM and GRU layers

6.2.3. A concrete LSTM example in Keras

6.2.4. Wrapping up

6.3. Advanced usage of recurrent neural networks

6.3.1. A temperature forecasting problem

6.3.2. Preparing the data

6.3.3. A common sense, non-machine learning baseline

6.3.4. A basic machine learning approach

6.3.5. A first recurrent baseline

6.3.6. Using recurrent dropout to fight overfitting

6.3.7. Stacking recurrent layers

6.3.8. Using bidirectional RNNs

6.3.9. Going even further

6.3.10. Wrapping up

6.4. Sequence processing with convnets

6.4.1. Understanding 1D convolution for sequence data

6.4.2. 1D Pooling for sequence data

6.4.3. Implementing a 1D convnet

6.4.4. Combining CNNs and RNNs to process long sequences

6.4.5. Wrapping up

7. Advanced deep learning best practices

7.1. Going beyond the Sequential model: the Keras functional API

7.1.1. Introduction to the functional API

7.1.2. Multi-input models

7.1.3. Multi-output models

7.1.4. Directed acyclic graphs of layers

7.1.5. Layer weight sharing

7.1.6. Models as layers

7.1.7. Wrapping up

7.2. Inspecting and monitoring deep learning models: using Keras callbacks and TensorBoard

7.2.1. Using callbacks to act on a model during training

7.2.2. Introduction to TensorBoard: the TensorFlow visualization framework

7.2.3. Wrapping up

7.3. Getting the most out of your models

7.3.1. Advanced architecture patterns

7.3.2. Hyperparameter optimization

7.3.3. Model ensembling

7.3.4. Wrapping up

8. Generative deep learning

8.1. Text generation with LSTM

8.1.1. A brief history of generative recurrent networks

8.1.2. How can we generate sequence data?

8.1.3. The importance of the sampling strategy

8.1.4. Implementing character-level LSTM text generation

8.1.5. Wrapping up

8.2. Deep Dream

8.2.1. Implementing Deep Dream in Keras

8.2.2. Wrapping up

8.3. Neural style transfer

8.3.1. The content loss

8.3.2. The style loss

8.3.3. Neural style transfer in Keras

8.3.4. Wrapping up

8.4. Generating images with Variational Autoencoders

8.4.1. Sampling from latent spaces of images

8.4.2. Concept vectors for image editing

8.4.3. Variational autoencoders

8.4.4. Wrapping up

8.5. Introduction to generative adversarial networks

8.5.1. A schematic GAN implementation

8.5.2. A bag of tricks

8.5.3. The generator

8.5.4. The discriminator

8.5.5. The adversarial network

8.5.6. How to train your DCGAN

8.5.7. Wrapping up

9. Conclusions

9.1. Key concepts in review

9.1.1. Different brands of approaches to AI

9.1.2. What makes deep learning special within machine learning

9.1.3. How to think about deep learning

9.1.4. Key enabling technologies

9.1.5. The universal machine learning workflow

9.1.6. Key network architectures

9.1.7. The space of possibilities

9.1.8. Mapping image data to vector data

9.1.9. Mapping timeseries data to vector data

9.2. The limitations of deep learning

9.2.1. The risk of anthropomorphizing machine learning models

9.2.2. Local generalization versus extreme generalization

9.2.3. Take-aways

9.3. The future of deep learning

9.3.1. Models as programs

9.3.2. Beyond backpropagation and differentiable layers

9.3.3. Automated machine learning

9.3.4. Lifelong learning and modular subroutine reuse

9.3.5. In summary: the long-term vision

9.4. Staying up to date in a fast-moving field

9.4.1. Practice on real-world problems using Kaggle

9.4.2. Read about the latest developments on Arxiv

9.4.3. Explore the Keras ecosystem

9.5. Final words


Appendix A: Installing Keras and its dependencies on Ubuntu

A.1. Installing the Python scientific suite

A.2. Setting up GPU support

A.3. Installing Theano (optional)

A.4. Installing Keras

Appendix B: Running Jupyter notebooks on an EC2 GPU instance

B.1. What are Jupyter notebooks? Why run Jupyter notebooks on AWS GPUs?

B.2. Why would you not want to use Jupyter on AWS for deep learning?

B.3. Setting up an AWS GPU instance

B.3.1. Configuring Jupyter

B.4. Installing Keras

B.5. Setting up local port forwarding

B.6. Using Jupyter from your local browser

What's inside

  • Deep learning from first principles
  • Setting up your own deep-learning environment
  • Image-classification models
  • Deep learning for text and sequences
  • Neural style transfer, text generation, and image generation

About the reader

Readers need intermediate Python skills. No previous experience with Keras, TensorFlow, or machine learning is required.

About the author

François Chollet works on deep learning at Google in Mountain View, CA. He is the creator of the Keras deep-learning library, as well as a contributor to the TensorFlow machine-learning framework. He also does deep-learning research, with a focus on computer vision and the application of machine learning to formal reasoning. His papers have been published at major conferences in the field, including the Conference on Computer Vision and Pattern Recognition (CVPR), the Conference and Workshop on Neural Information Processing Systems (NIPS), the International Conference on Learning Representations (ICLR), and others.
