Ensemble Methods for Machine Learning
Gautam Kunapuli
  • MEAP began July 2020
  • Publication in Spring 2021 (estimated)
  • ISBN 9781617297137
  • 350 pages (estimated)
  • printed in black & white

"The definitive and complete guide on ensemble learning. A must read!"

Al Krinker
Many machine learning problems are too complex to be solved by a single model or algorithm. Ensemble machine learning trains a group of diverse models to work together on a problem. By aggregating their outputs, these ensembles can flexibly deliver rich and accurate results. Ensemble Methods for Machine Learning is a guide to ensemble methods with proven track records in data science competitions and real-world applications. Through hands-on case studies, you'll develop an under-the-hood understanding of foundational ensemble learning algorithms and learn to deliver accurate, performant models.

About the Technology

Ensemble machine learning lets you make robust predictions without needing the huge datasets and processing power demanded by deep learning. It sets multiple models to work on solving a problem, combining their results for better performance than a single model working alone. This "wisdom of crowds" approach distills information from several models into a set of highly accurate results.
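To make the "wisdom of crowds" idea concrete, here is a minimal sketch of a majority-vote ensemble using scikit-learn's VotingClassifier on its built-in breast-cancer data (the same dataset as the chapter 2 case study). The base models chosen here are illustrative assumptions, not code from the book; because diverse models tend to err on different examples, their majority vote is usually more accurate than any single one of them.

    # A minimal voting-ensemble sketch; the base models are illustrative choices.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Three diverse base models; VotingClassifier aggregates their predictions
    # by majority rule (voting="hard" is the default).
    ensemble = VotingClassifier(estimators=[
        ("tree", DecisionTreeClassifier(max_depth=3)),
        ("knn", KNeighborsClassifier()),
        ("logreg", LogisticRegression(max_iter=5000)),
    ])
    ensemble.fit(X_train, y_train)
    print("ensemble accuracy:", ensemble.score(X_test, y_test))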

About the book

In Ensemble Methods for Machine Learning you'll learn to implement the most important ensemble machine learning methods from scratch. Each chapter contains a new case study, taking you hands-on with a fully functioning ensemble method for medical diagnosis, sentiment analysis, handwriting classification, and more. There's no complex math or theory: each method is taught in a practical, visuals-first manner. Best of all, the code is provided in Jupyter notebooks for easy experimentation! By the time you're done, you'll know the benefits, limitations, and practical methods of applying ensemble machine learning to real-world data, and you'll be ready to build more explainable ML systems.
Table of Contents

Part 1: The Basics of Ensembles

1 Ensemble Methods: Hype or Hallelujah?

1.1 Ensemble Methods: The Wisdom of the Crowds

1.2 Why You Should Care About Ensemble Learning

1.3 Fit vs. Complexity in Individual Models

1.3.1 Regression with Decision Trees

1.3.2 Regression with Support Vector Machines

1.4 Our First Ensemble

1.5 Summary

Part 2: Ensemble Methods

2 Homogeneous Parallel Ensembles: Bagging and Random Forests

2.1 Parallel Ensembles

2.2 Bagging: Bootstrap Aggregating

2.2.1 Intuition: Resampling and Model Aggregation

2.2.2 Implementing Bagging

2.2.3 Bagging with scikit-learn

2.2.4 Faster Training with Parallelization

2.3 Random Forests

2.3.1 Randomized Decision Trees

2.3.2 Random Forests with scikit-learn

2.3.3 Feature Importances

2.4 More Homogeneous Parallel Ensembles

2.4.1 Pasting

2.4.2 Random Subspaces and Random Patches

2.4.3 ExtraTrees

2.5 Case Study: Breast Cancer Diagnosis

2.5.1 Loading and Pre-processing

2.5.2 Bagging, Random Forests, and ExtraTrees

2.5.3 Feature Importances with Random Forests

2.6 Summary

3 Heterogeneous Parallel Ensembles: Combining Strong Learners

3.1 Base Estimators for Heterogeneous Ensembles

3.1.1 Fitting Base Estimators

3.1.2 Individual Predictions of Base Estimators

3.2 Combining Predictions by Weighting

3.2.1 Majority Vote

3.2.2 Accuracy Weighting

3.2.3 Entropy Weighting

3.2.4 Dempster-Shafer Combination

3.3 Combining Predictions by Meta-Learning

3.3.1 Stacking

3.3.2 Stacking with Cross-Validation

3.4 Case Study: Sentiment Analysis

3.4.1 Pre-processing

3.4.2 Dimensionality Reduction

3.4.3 Stacking Classifiers

3.5 Summary

4 Sequential Ensembles: Boosting

4.1 Sequential Ensembles of Weak Learners

4.2 AdaBoost: ADAptive BOOSTing

4.2.1 Intuition: Learning with Weighted Examples

4.2.2 Implementing AdaBoost

4.2.3 AdaBoost with scikit-learn

4.3 AdaBoost in Practice

4.3.1 Learning Rate

4.3.2 Early Stopping and Pruning

4.4 Case Study: Handwritten Digit Classification

4.4.1 Dimensionality Reduction with t-SNE

4.4.2 Boosting

4.5 LogitBoost: Boosting with the Logistic Loss

4.6 Summary

5 Sequential Ensembles: Gradient Boosting

6 Sequential Ensembles: Newton Boosting

Part 3: Beyond Classification

7 Ensembles for Regression

8 Ensembles for Clustering

9 Ensemble Diversity

Part 4: Advanced Ensemble Learning

10 Interpretability and Explainability

11 Human-in-the-Loop Ensembles

What's inside

  • Bagging, boosting, and gradient boosting
  • Methods for classification, regression, clustering, and recommendations
  • Sophisticated off-the-shelf ensemble implementations
  • Feature engineering and ensemble diversity
  • Interpretability and explainability for ensemble methods
  • Human-in-the-loop methods

About the reader

For Python programmers with machine learning experience.

About the author

Gautam Kunapuli has over 15 years of experience in academia and the machine learning industry. He has developed several novel algorithms for diverse application domains, including social network analysis, text and natural language processing, behavior mining, educational data mining, and biomedical applications. He has also published papers exploring ensemble methods in relational domains and with imbalanced data.
