Machine Learning Bookcamp
Build a portfolio of real-life projects
Alexey Grigorev
  • MEAP began January 2020
  • Publication in Spring 2021 (estimated)
  • ISBN 9781617296819
  • 475 pages (estimated)
  • printed in black & white

"An amazing introduction to learning machine learning by doing projects."
Joseph Perenia

The only way to learn is to practice! In Machine Learning Bookcamp, you’ll create and deploy Python-based machine learning models for a variety of increasingly challenging projects. Taking you from the basics of machine learning to complex applications such as image and text analysis, each new project builds on what you’ve learned in previous chapters. By the end of the bookcamp, you’ll have built a portfolio of business-relevant machine learning projects that hiring managers will be excited to see.

About the Technology

Machine learning is an analysis technique for predicting trends and relationships based on historical data. As ML has matured as a discipline, an established set of algorithms has emerged for tackling a wide range of analysis tasks in business and research. By practicing the most important algorithms and techniques, you can quickly gain a footing in this important area. Luckily, that’s exactly what you’ll be doing in Machine Learning Bookcamp.

About the book

In Machine Learning Bookcamp you’ll learn the essentials of machine learning by completing a carefully designed set of real-world projects. Beginning as a novice, you’ll start with the basic concepts of ML before tackling your first challenge: creating a car price predictor using linear regression algorithms. You’ll then advance through increasingly difficult projects, developing your skills to build a churn prediction application, a flight delay calculator, an image classifier, and more. When you’re done working through these fun and informative projects, you’ll have a comprehensive machine learning skill set you can apply to practical on-the-job problems.
Table of Contents

1 Introduction to machine learning

1.1 Machine learning

1.1.1 Machine learning vs. rule-based systems

1.1.2 When machine learning isn’t helpful

1.1.3 Supervised machine learning

1.2 Machine learning process

1.2.1 Business understanding step

1.2.2 Data understanding step

1.2.3 Data preparation step

1.2.4 Modeling step

1.2.5 Evaluation step

1.2.6 Deployment step

1.2.7 Iterate

1.3 Modeling and model validation

1.4 Summary

2 Machine learning for regression

2.1 Car-price prediction project

2.1.1 Downloading the dataset

2.2 Exploratory data analysis

2.2.1 Exploratory data analysis toolbox

2.2.2 Reading and preparing data

2.2.3 Target variable analysis

2.2.4 Checking for missing values

2.2.5 Validation framework

2.3 Machine learning for regression

2.3.1 Linear regression

2.3.2 Training linear regression model

2.4 Predicting the price

2.4.1 Baseline solution

2.4.2 RMSE: evaluating model quality

2.4.3 Validating the model

2.4.4 Simple feature engineering

2.4.5 Handling categorical variables

2.4.6 Regularization

2.4.7 Using the model

2.5 Next steps

2.5.1 Exercises

2.5.2 Other projects

2.6 Summary

2.7 Answers to exercises

3 Machine learning for classification

3.1 Churn prediction project

3.1.1 Telco churn dataset

3.1.2 Initial data preparation

3.1.3 Exploratory data analysis

3.1.4 Feature importance

3.2 Feature engineering

3.2.1 One-hot encoding for categorical variables

3.3 Machine learning for classification

3.3.1 Logistic regression

3.3.2 Training logistic regression

3.3.3 Model interpretation

3.3.4 Using the model

3.4 Next steps

3.4.1 Exercises

3.4.2 Other projects

3.5 Summary

3.6 Answers to exercises

4 Evaluation metrics for classification

4.1 Evaluation metrics

4.1.1 Classification accuracy

4.1.2 Dummy baseline

4.2 Confusion table

4.2.1 Introduction to confusion table

4.2.2 Calculating the confusion table with NumPy

4.2.3 Precision and recall

4.3 ROC curve and AUC score

4.3.1 True positive rate and false positive rate

4.3.2 Evaluating a model at multiple thresholds

4.3.3 Random baseline model

4.3.4 The ideal model

4.3.5 ROC curve

4.3.6 Area under the ROC curve (AUC)

4.4 Parameter tuning

4.4.1 K-fold cross-validation

4.4.2 Finding best parameters

4.5 Next steps

4.5.1 Other projects

4.6 Summary

4.7 Answers to exercises

5 Deploying machine learning models

5.1 Churn prediction model

5.1.1 Using the model

5.1.2 Using Pickle to save and load the model

5.2 Model serving

5.2.1 Web services

5.2.2 Flask

5.2.3 Serving churn model with Flask

5.3 Managing dependencies

5.3.1 Pipenv

5.3.2 Docker

5.4 Deployment

5.4.1 AWS Elastic Beanstalk

5.5 Next steps

5.5.1 Exercises

5.5.2 Other projects

5.6 Summary

6 Decision trees and ensemble learning

6.1 Credit risk scoring project

6.1.1 Credit scoring dataset

6.1.2 Data cleaning

6.1.3 Dataset preparation

6.2 Decision trees

6.2.1 Decision tree classifier

6.2.2 Decision tree learning algorithm

6.2.3 Parameter tuning

6.3 Random forest

6.3.1 Training a random forest

6.3.2 Parameter tuning

6.4 Gradient boosting machines

6.4.1 XGBoost: extreme gradient boosting

6.4.2 Model performance monitoring

6.4.3 Parameter tuning

6.4.4 Testing the final model

6.5 Next steps

6.5.1 Exercises

6.5.2 Other projects

6.6 Summary

7 Neural networks and deep learning

8 Serving deep learning models

9 Working with texts

10 Getting training data

Appendix A: Installing the libraries

A.1 Installing Python and Anaconda

A.1.1 Installing Python and Anaconda on Linux

A.1.2 Installing Python and Anaconda on Windows

A.1.3 Installing Python and Anaconda on macOS

A.2 Running Jupyter

A.2.1 Running Jupyter on Linux

A.2.2 Running Jupyter on Windows

A.2.3 Running Jupyter on macOS

A.3 Installing the Kaggle CLI

A.4 Accessing the source code

A.5 Renting a server on AWS

A.5.1 Registering on AWS

A.5.2 Accessing billing information

A.5.3 Creating an EC2 instance

A.5.4 Connecting to the instance

A.5.5 Shutting down the instance

A.5.6 Creating an EC2 instance with the AWS CLI

A.6 Summary

Appendix B: Python basics

B.1 Variables

B.1.1 Control-flow

B.1.2 Collections

B.1.3 Code reusability

B.1.4 Installing libraries

B.1.5 Python programs

B.1.6 Summary

Appendix C: NumPy and Linear Algebra

C.1.1 NumPy

C.1.2 NumPy operations

C.1.3 Linear algebra

C.1.4 Summary

Appendix D: Introduction to Pandas

D.1 Pandas

D.1.1 DataFrame

D.1.2 Series

D.1.3 Index

D.1.4 Accessing rows

D.1.5 Splitting DataFrame

D.2 Operations

D.2.1 Element-wise operations

D.2.2 Filtering

D.2.3 String operations

D.2.4 Summarizing operations

D.2.5 Missing values

D.2.6 Sorting

D.2.7 Grouping

D.3 Summary

What's inside

  • Code fundamental ML algorithms from scratch
  • Collect and clean data for training models
  • Use popular Python tools, including NumPy, Pandas, Scikit-Learn, and TensorFlow
  • Apply ML to complex datasets with images and text
  • Deploy ML models to a production-ready environment

About the reader

For readers with existing programming skills. No previous machine learning experience required.

About the author

Alexey Grigorev has more than ten years of experience as a software engineer, and has spent the last six years focused on machine learning. Currently, he works as a lead data scientist at the OLX Group, where he deals with content moderation and image models. He is the author of two other books on using Java for data science and TensorFlow for deep learning.

Manning Early Access Program (MEAP): Read chapters as they are written, get the finished eBook as soon as it’s ready, and receive the pBook long before it's in bookstores.