To current and future readers of Grokking Deep Learning,
After nearly three years of effort, I’ve delivered the final chapters of Grokking Deep Learning. It's been a monumental challenge, and I'm very grateful for your patience. I'd like to tell you why I think this is a fantastic book. First, though, let me tell you why it took so long to write.
Grokking Deep Learning is just over 300 pages long. To get to those 300 pages, though, I wrote at least twice that number. Half a dozen chapters were rewritten from scratch three or four times before they were ready to publish, and along the way we added some important chapters that weren’t in the original Table of Contents.
More significantly, we arrived at two expensive decisions early on that make Grokking Deep Learning uniquely valuable: This book requires no math background beyond basic arithmetic, and it doesn’t rely on a high-level library that might hide what’s going on. In other words, anyone can read this book, and understand how deep learning really works. To accomplish this, we had to invent new ways to describe and teach the core ideas and techniques without falling back on advanced mathematics or sophisticated code that someone else wrote.
My goal in writing Grokking Deep Learning was to create the lowest possible barrier to entry to the practice of Deep Learning. You won’t just read the theory, you’ll discover it yourself. To help you get there, I had to write a lot of code, and the book had to explain it all in the right order so that the code snippets required for the working demos all made sense.
Now, let me tell you about three changes we’re especially proud of:
- We originally included a chapter on Question Answering. However, very few products actually need QA technology, and the topic doesn’t get at the timeless fundamentals we want to cover in the book. In place of the QA chapter, I doubled the page count on recurrent neural networks. You get one chapter on RNNs and another on the incredibly important topic of LSTMs. Recurrent neural networks are the state-of-the-art approach in nearly every sequence-modeling field I can think of, and they're also one of the most popular tools you will be using in industry. This was an easy choice.
- Also, I've added a chapter focusing on privacy. Deep Learning is fundamentally constrained by the availability of training data, which has been centralized within large organizations. It’s my personal view that the availability of training data will drastically change in the coming decade, and it's going to transform how Deep Learning advances. This chapter introduces a few basic privacy concepts, including Federated Learning, Homomorphic Encryption, and concepts related to Differential Privacy and Secure Multi-Party Computation.
- Finally, the book originally included a chapter on Policy Gradients aka “Reinforcement Learning.” This is an important and popular topic, but it’s a field of its own to which Deep Learning has been applied. We decided that covering RL lightly wouldn’t add much value and made the tough choice to remove that chapter.
Instead, I wound up creating possibly the most valuable piece I've ever done on the subject of Deep Learning, and easily the most valuable chapter in the whole book: building a deep learning framework from scratch.
In the real world, you will spend 5% of your time coming up with a "cool new idea" to tackle a problem and 95% of your time wrestling with a framework (PyTorch, Keras, TensorFlow, etc.), trying to bring your idea to life. It is my great hope that the new Chapter 13 will fast-track you to becoming a power user of Deep Learning frameworks by giving you a mental model of what actually happens inside them.
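To give a flavor of that mental model, here is a minimal, hypothetical sketch of reverse-mode autograd, the core mechanism inside frameworks like PyTorch. The `Tensor` class and its method names here are illustrative assumptions, not the book's actual code:

```python
# A minimal sketch of reverse-mode automatic differentiation.
# Each operation records its inputs and how to route gradients backward.

class Tensor:
    def __init__(self, value, parents=()):
        self.value = value            # scalar data, for simplicity
        self.grad = 0.0               # accumulated gradient
        self.parents = parents        # tensors this one was computed from
        self.backward_fn = None       # propagates out.grad to parents

    def __add__(self, other):
        out = Tensor(self.value + other.value, (self, other))
        def backward_fn():
            # d(a+b)/da = 1 and d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out.backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Tensor(self.value * other.value, (self, other))
        def backward_fn():
            # d(a*b)/da = b and d(a*b)/db = a
            self.grad += other.value * out.grad
            other.grad += self.value * out.grad
        out.backward_fn = backward_fn
        return out

    def backward(self):
        # Topologically order the graph, then apply the chain rule in reverse.
        order, seen = [], set()
        def visit(t):
            if id(t) not in seen:
                seen.add(id(t))
                for p in t.parents:
                    visit(p)
                order.append(t)
        visit(self)
        self.grad = 1.0
        for t in reversed(order):
            if t.backward_fn is not None:
                t.backward_fn()

# y = x*x + x, so dy/dx = 2x + 1 = 7 at x = 3
x = Tensor(3.0)
y = x * x + x
y.backward()
print(x.grad)  # 7.0
```

Real frameworks generalize exactly this bookkeeping to n-dimensional arrays and hundreds of operations, which is why understanding the small version pays off so quickly.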
This knowledge, combined with all the theory, code, and examples you explore in this book, will make you much faster at iterating through experiments. You'll have quick successes, better job opportunities, and you’ll even learn about more advanced Deep Learning concepts more rapidly.
I sincerely hope you enjoy Grokking Deep Learning!
Andrew Trask
August 31, 2018