An excellent introduction and overview of deep learning by a masterful teacher who guides, illuminates, and encourages you along the way.
1 introducing deep learning: why you should learn it
Why you should learn deep learning
Will this be difficult to learn?
Why you should read this book
What you need to get started
You’ll probably need some Python knowledge
Summary
2 fundamental concepts: how do machines learn?
What is deep learning?
Supervised machine learning
Unsupervised machine learning
Parametric vs. nonparametric learning
Supervised parametric learning
Unsupervised parametric learning
Nonparametric learning
Summary
3 introduction to neural prediction: forward propagation
Step 1: Predict
A simple neural network making a prediction
What is a neural network?
What does this neural network do?
Making a prediction with multiple inputs
Multiple inputs: What does this neural network do?
Multiple inputs: Complete runnable code
Making a prediction with multiple outputs
Predicting with multiple inputs and outputs
Multiple inputs and outputs: How does it work?
Predicting on predictions
A quick primer on NumPy
Summary
4 introduction to neural learning: gradient descent
Predict, compare, and learn
Compare
Learn
Compare: Does your network make good predictions?
Why measure error?
What’s the simplest form of neural learning?
Hot and cold learning
Characteristics of hot and cold learning
Calculating both direction and amount from error
One iteration of gradient descent
Learning is just reducing error
Let’s watch several steps of learning
Why does this work? What is weight_delta, really?
Tunnel vision on one concept
A box with rods poking out of it
Derivatives: Take two
What you really need to know
What you don’t really need to know
How to use a derivative to learn
Look familiar?
Breaking gradient descent
Visualizing the overcorrections
Divergence
Introducing alpha
Alpha in code
Memorizing
5 learning multiple weights at a time: generalizing gradient descent
Gradient descent learning with multiple inputs
Gradient descent with multiple inputs explained
Let’s watch several steps of learning
Freezing one weight: What does it do?
Gradient descent learning with multiple outputs
Gradient descent with multiple inputs and outputs
What do these weights learn?
Visualizing weight values
Visualizing dot products (weighted sums)
Summary
6 building your first deep neural network: introduction to backpropagation
The streetlight problem
Preparing the data
Matrices and the matrix relationship
Creating a matrix or two in Python
Building a neural network
Learning the whole dataset
Full, batch, and stochastic gradient descent
Neural networks learn correlation
Up and down pressure
Edge case: Overfitting
Edge case: Conflicting pressure
Learning indirect correlation
Creating correlation
Stacking neural networks: A review
Backpropagation: Long-distance error attribution
Backpropagation: Why does this work?
Linear vs. nonlinear
Why the neural network still doesn’t work
The secret to sometimes correlation
A quick break
Your first deep neural network
Backpropagation in code
One iteration of backpropagation
Putting it all together
Why do deep networks matter?
7 how to picture neural networks: in your head and on paper
It’s time to simplify
Correlation summarization
The previously overcomplicated visualization
The simplified visualization
Simplifying even further
Let’s see this network predict
Visualizing using letters instead of pictures
Linking the variables
Everything side by side
The importance of visualization tools
8 learning signal and ignoring noise: introduction to regularization and batching
Three-layer network on MNIST
Well, that was easy
Memorization vs. generalization
Overfitting in neural networks
Where overfitting comes from
The simplest regularization: Early stopping
Industry standard regularization: Dropout
Why dropout works: Ensembling works
Dropout in code
Dropout evaluated on MNIST
Batch gradient descent
Summary
9 modeling probabilities and nonlinearities: activation functions
What is an activation function?
Standard hidden-layer activation functions
Standard output layer activation functions
The core issue: Inputs have similarity
softmax computation
Activation installation instructions
Multiplying delta by the slope
Converting output to slope (derivative)
Upgrading the MNIST network
10 neural learning about edges and corners: intro to convolutional neural networks
Reusing weights in multiple places
The convolutional layer
A simple implementation in NumPy
Summary
11 neural networks that understand language: king - man + woman == ?
What does it mean to understand language?
Natural language processing (NLP)
Supervised NLP
IMDB movie reviews dataset
Capturing word correlation in input data
Predicting movie reviews
Intro to an embedding layer
Interpreting the output
Neural architecture
Comparing word embeddings
What is the meaning of a neuron?
Filling in the blank
Meaning is derived from loss
King - Man + Woman ~= Queen
Word analogies
Summary
12 neural networks that write like Shakespeare: recurrent layers for variable-length data
The challenge of arbitrary length
Do comparisons really matter?
The surprising power of averaged word vectors
How is information stored in these embeddings?
How does a neural network use embeddings?
The limitations of bag-of-words vectors
Using identity vectors to sum word embeddings
Matrices that change absolutely nothing
Learning the transition matrices
Learning to create useful sentence vectors
Forward propagation in Python
How do you backpropagate into this?
Let’s train it!
Setting things up
Forward propagation with arbitrary length
Backpropagation with arbitrary length
Weight update with arbitrary length
Execution and output analysis
Summary
13 introducing automatic optimization: let’s build a deep learning framework
What is a deep learning framework?
Introduction to tensors
Introduction to automatic gradient computation (autograd)
A quick checkpoint
Tensors that are used multiple times
Upgrading autograd to support multiuse tensors
How does addition backpropagation work?
Adding support for negation
Adding support for additional functions
Using autograd to train a neural network
Adding automatic optimization
Adding support for layer types
Layers that contain layers
Loss-function layers
How to learn a framework
Nonlinearity layers
The embedding layer
Adding indexing to autograd
The embedding layer (revisited)
The cross-entropy layer
The recurrent neural network layer
Summary
14 learning to write like Shakespeare: long short-term memory
Character language modeling
The need for truncated backpropagation
Truncated backpropagation
A sample of the output
Vanishing and exploding gradients
A toy example of RNN backpropagation
Long short-term memory (LSTM) cells
Some intuition about LSTM gates
The long short-term memory layer
Upgrading the character language model
Training the LSTM character language model
Tuning the LSTM character language model
Summary
15 deep learning on unseen data: introducing federated learning
The problem of privacy in deep learning
Federated learning
Learning to detect spam
Let’s make it federated
Hacking into federated learning
Secure aggregation
Homomorphic encryption
Homomorphically encrypted federated learning
Summary
16 where to go from here: a brief guide
Congratulations!
Step 1: Start learning PyTorch
Step 2: Start another deep learning course
Step 3: Grab a mathy deep learning textbook
Step 4: Start a blog, and teach deep learning
Step 5: Twitter
Step 6: Implement academic papers
Step 7: Acquire access to a GPU (or many)
Step 8: Get paid to practice
Step 9: Join an open source project
Step 10: Develop your local community
About the technology
Deep learning, a branch of artificial intelligence, teaches computers to learn by using neural networks, technology inspired by the human brain. Online text translation, self-driving cars, personalized product recommendations, and virtual voice assistants are just a few of the exciting modern advancements possible thanks to deep learning.
About the book
Grokking Deep Learning teaches you to build deep learning neural networks from scratch! In his engaging style, seasoned deep learning expert Andrew Trask shows you the science under the hood, so you grok for yourself every detail of training neural networks. Using only Python and its math-supporting library, NumPy, you’ll train your own neural networks to see and understand images, translate text into different languages, and even write like Shakespeare! When you’re done, you’ll be fully prepared to move on to mastering deep learning frameworks.
What's inside
- The science behind deep learning
- Building and training your own neural networks
- Privacy concepts, including federated learning
- Tips for continuing your pursuit of deep learning
About the author
Andrew Trask is a PhD student at Oxford University and a research scientist at DeepMind. Previously, Andrew was a researcher and analytics product manager at Digital Reasoning, where he trained the world’s largest artificial neural network and helped guide the analytics roadmap for the Synthesys cognitive computing platform.