Grokking Machine Learning
Luis G. Serrano
  • MEAP began May 2019
  • Publication in Spring 2020 (estimated)
  • ISBN 9781617295911
  • 350 pages (estimated)
  • printed in black & white

Written in an approachable manner with great use of very illustrative and applicable examples.

Borko Djurkovic
It's time to dispel the myth that machine learning is difficult. Grokking Machine Learning teaches you how to apply ML to your projects using only standard Python code and high school-level math. No specialist knowledge is required to tackle the hands-on exercises using readily available machine learning tools!
Table of Contents

1 What is machine learning?

1.1 Why this book?

1.2 Is machine learning hard?

1.3 But what exactly is machine learning?

1.3.1 What is the difference between artificial intelligence and machine learning?

1.3.2 What about deep learning?

1.4 Humans use the remember-formulate-predict framework to make decisions (and so can machines!)

1.4.1 How do humans think?

1.4.2 How do machines think?

1.5 What is this book about?

1.6 Summary

2 Types of machine learning

2.1 What is the difference between labelled and unlabelled data?

2.2 What is supervised learning?

2.2.1 Regression models predict numbers

2.2.2 Classification models predict a state

2.3 What is unsupervised learning?

2.3.1 Clustering algorithms split a dataset into similar groups

2.3.2 Dimensionality reduction simplifies data without losing much information

2.3.3 Matrix factorization and other types of unsupervised learning

2.4 What is reinforcement learning?

2.5 Summary

3 Drawing a line close to our points: Linear regression

3.1 The problem: We need to predict the price of a house

3.2 The solution: Building a regression model for housing prices

3.2.1 The remember step: looking at the prices of existing houses

3.2.2 The formulate step: formulating a rule that estimates the price of the house

3.2.3 The predict step: what do we do when a new house comes on the market?

3.2.4 Some questions that arise and some quick answers

3.3 How to get the computer to draw this line: the linear regression algorithm

3.3.1 Crash course on slope and y-intercept

3.3.2 A simple trick to move a line closer to a set of points, one point at a time

3.3.2 The square trick: A much more clever way of moving our line closer to one of the points

3.3.3 The linear regression algorithm: Repeating the square trick many times

3.3.4 Plotting dots and lines

3.3.5 Using the linear regression algorithm in our dataset

3.4 Applications

3.4.1 Applications of linear regression

3.5 Summary

4 Using lines to split our points: The perceptron algorithm

4.1 The problem: We have too much spam email and need the computer to sort it

4.2 The solution: Building a spam classifier

4.2.1 The remember step: looking at our data

4.2.2 The formulate step: formulating a rule that guesses if an email is spam or ham

4.2.3 The predict step: what to do when a new email comes in?

4.2.4 Some questions that arise and some quick answers

4.2.5 Plotting our data and our classifier

4.4 How to get the computer to build this classifier: the perceptron algorithm

4.4.1 The perceptron trick: Improving our classifier one step at a time

4.4.2 Coding the perceptron trick

4.4.2 The perceptron algorithm: Repeating the perceptron trick many times

4.4.3 Using the perceptron algorithm on our dataset

4.5 Applications

4.5.1 More email spam classification - using other features aside from words

4.5.2 Sentiment analysis: analyzing if sentences are happy or sad

4.5.3 Image recognition: getting the computer to see and categorize images

4.6 Summary

5 Using probability to its maximum: The naive Bayes algorithm

5.1 Sick or healthy? A story with Bayes' theorem

5.1.2 Prelude to Bayes' theorem: The prior, the event, and the posterior

5.2 Use-case: Spam detection model

5.2.1 Finding the prior: The probability that any email is spam

5.2.2 Finding the posterior: The probability that an email is spam knowing that it contains a particular word

5.2.3 What the math just happened? Turning ratios into probabilities

5.2.3 What about two words? The naive Bayes algorithm

5.2.4 What about more than two words?

5.3 Building a spam detection model with real data

5.3.1 Data preprocessing

5.3.2 Finding the priors

5.3.3 Finding the posteriors with Bayes' theorem

5.3.4 Implementing the naive Bayes algorithm

5.3.5 Further work

5.4 Summary

6 Splitting data by asking questions: Decision trees

6.1 The problem: We need to recommend apps to users according to what they are likely to download

6.2 The solution: Building an app recommendation system

6.2.1 The remember-formulate-predict framework

6.2.2 First step to build the model: Asking the best question

6.2.3 Next and final step: Iterate by asking the best question every time

6.2.3 Using the model by making predictions

6.3 Building the tree: How to pick the right feature to split

6.3.1 How to pick the best feature to split our data: Accuracy

6.3.2 How to pick the best feature to split our data: Gini impurity

6.4 Back to recommending apps: Building our decision tree using the Gini index

6.6 Beyond questions like yes/no

6.5.1 Features with more categories, such as Dog/Cat/Bird

6.5.2 Continuous features, such as a number

6.6 Coding a decision tree with sklearn

6.7 A slightly larger example: Spam detection again!

6.8 Applications

6.9.1 Decision trees are widely used in health care

6.9.2 Decision trees are useful in recommendation systems

6.11 Summary

7 A continuous approach to splitting points: logistic regression

8 Combining building blocks to gain more power: neural networks

9 Finding the best line separation: support vector machines

10 Combining our models to maximize results: bagging and boosting

11 Putting it all together: evaluating and improving models


Appendix A: The math behind the algorithms

About the Technology

Machine learning is a collection of mathematically based techniques and algorithms that enable computers to identify patterns and generate predictions from data. This revolutionary data analysis approach is behind everything from recommendation systems to self-driving cars, and is transforming industries from finance to art. Whatever your field, knowledge of machine learning is becoming an essential skill. Python, along with libraries such as NumPy, Pandas, and scikit-learn, has become the go-to language for machine learning.

About the book

In Grokking Machine Learning, expert machine learning engineer Luis Serrano introduces the most valuable ML techniques and teaches you how to make them work for you. You’ll only need high school math to dive into popular approaches and algorithms. Practical examples illustrate each new concept to ensure you’re grokking as you go. You’ll build models for spam detection, language analysis, and image recognition as you lock in each carefully selected skill. Packed with easy-to-follow Python-based exercises and mini-projects, this book sets you on the path to becoming a machine learning expert. When you’re done, you’ll have an intuitive understanding of the right approach for any machine learning task or project.

What's inside

  • Different types of machine learning, including supervised and unsupervised learning
  • Algorithms for simplifying, classifying, and splitting data
  • Machine learning packages and tools
  • Hands-on exercises with fully explained Python code samples

About the reader

For readers with intermediate programming knowledge in Python or a similar language. No machine learning experience or advanced math skills necessary.

About the author

Luis G. Serrano has worked as the Head of Content for Artificial Intelligence at Udacity and as a Machine Learning Engineer at Google, where he worked on the YouTube recommendations system. He holds a PhD in mathematics from the University of Michigan and bachelor's and master's degrees from the University of Waterloo, and he worked as a postdoctoral researcher at the University of Quebec at Montreal. He shares his machine learning expertise on a YouTube channel with over 2 million views and 35 thousand subscribers, and is a frequent speaker at artificial intelligence and data science conferences.

Manning Early Access Program (MEAP) Read chapters as they are written, get the finished eBook as soon as it’s ready, and receive the pBook long before it's in bookstores.
MEAP combo $49.99 pBook + eBook + liveBook
MEAP eBook $39.99 pdf + ePub + kindle + liveBook