Human-in-the-Loop Machine Learning
Robert Munro
  • MEAP began July 2019
  • Publication in Spring 2020 (estimated)
  • ISBN 9781617296741
  • 325 pages (estimated)
  • printed in black & white

I've learned a lot of new things about Machine Learning I would never even have considered before.

Michiel Trimpe
How humans and machines should work together to solve problems is one of the most important questions in technology. In machine learning, however, the accuracy of innovative algorithms often gets the most attention. To build the most accurate model quickly, you also need clean, relevant, correctly labeled data for your system to train on. Human-in-the-Loop Machine Learning is a practical guide to optimizing the entire machine learning process, including techniques for annotation, sampling, and even using ML systems to help automate the process.
Table of Contents

Part 1: First Steps

1 Introduction to Human-in-the-Loop Machine Learning

1.1 The Basic Principles of Human-in-the-Loop Machine Learning

1.2 Introducing Annotation

1.2.1 Simple and more complicated annotation strategies

1.2.2 Plugging the gap in data science knowledge

1.2.3 Quality human annotations: why is it hard?

1.3 Introducing Active Learning: improving the speed & cost of training data

1.3.1 Three broad Active Learning sampling strategies: uncertainty, diversity, and random

1.3.2 What is a random selection of evaluation data?

1.3.3 When to use Active Learning?

1.4 Machine Learning and Human-Computer Interaction

1.4.1 User interfaces: how do you create training data?

1.4.2 Priming: what can influence human perception?

1.4.3 The pros and cons of creating labels by evaluating Machine Learning predictions

1.4.4 Basic principles for designing annotation interfaces

1.5 Machine Learning-Assisted Humans vs Human-Assisted Machine Learning

1.6 Transfer learning to kick-start your models

1.6.1 Transfer Learning in Computer Vision

1.6.2 Transfer Learning in Natural Language Processing

1.7 Summary

2 Getting Started with Human-in-the-Loop Machine Learning

2.1 Beyond “Hack-tive Learning”: your first Active Learning algorithm

2.1.1 The architecture of your first HuML system

2.2 Interpreting model predictions and data to support Active Learning

2.2.1 Confidence ranking

2.2.2 Identifying outliers

2.2.3 What to expect as you iterate

2.3 Building an interface to get human labels

2.3.1 A simple interface for labeling text

2.3.2 Managing Machine Learning data

2.4 Deploying your first Human-in-the-Loop Machine Learning system

2.4.1 Always get your evaluation data first!

2.4.2 Every data point gets a chance

2.4.3 Select the right strategies for your data

2.4.4 Retrain the model and iterate

2.5 Summary

Part 2: Active Learning Strategies

3 Uncertainty Sampling

3.1 Interpreting Uncertainty in a Machine Learning Model

3.1.1 Why look for uncertainty in your model?

3.1.2 Interpreting the scores from your model

3.1.3 “Score”, “Confidence”, and “Probability”: Do not trust the name!

3.1.4 SoftMax: converting the model output into confidences

3.2 Algorithms for Uncertainty Sampling

3.2.1 Least Confidence sampling

3.2.2 Margin of Confidence sampling

3.2.3 Entropy (classification entropy)

3.3 Identifying when different types of models are confused

3.3.1 What is the best activation function for Active Learning?

3.3.2 Uncertainty sampling with Logistic Regression and MaxEnt models

3.3.3 Uncertainty sampling with Support Vector Machines

3.3.4 Uncertainty sampling with Bayesian Models

3.3.5 Uncertainty sampling with Decision Trees & Random Forests

3.3.6 Uncertainty sampling with Ensemble models

3.4 Selecting the right number of items for human-review

3.4.1 Budget-constrained uncertainty sampling

3.4.2 Time-constrained uncertainty sampling

3.4.3 When do I stop if I’m not time or budget constrained?

3.5 Evaluating the success of uncertainty sampling

3.5.1 Do I need new test data?

3.5.2 Do I need new validation data?

3.6 Summary

4 Sampling for Diversity & Outliers

5 Other Active-Learning Sampling Strategies

Part 3: Annotating Data for Machine Learning

6 Who are the right people to annotate your data?

7 Quality control for data annotation

8 User interfaces for data annotation

Part 4: Transfer Learning and Pre-Trained Models

9 What are Embeddings?

10 What is Transfer Learning?

Part 5: Adaptive Learning: putting it all together

11 Machine-Learning for aiding human annotation

12 Advanced Human-in-the-Loop Machine Learning

About the Technology

“Human-in-the-Loop machine learning” refers to the need for human interaction with machine learning systems to improve human performance, machine performance, or both. Most machine learning projects do not have the time or budget for human input on every data point, and so need strategies for deciding which data points are the most important for human review. Ongoing human involvement with the right interfaces expedites the efficient labeling of tricky or novel data that a machine can’t process, reducing the potential for data-related errors.
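As a rough illustration of deciding which data points go to human review, the sketch below ranks items by model uncertainty, using the three scores named in the chapter 3 outline (least confidence, margin of confidence, entropy). It is not taken from the book: the function names, the normalization of each score to the 0-1 range, and the sample messages are assumptions made for this example.

```python
import math

def softmax(scores):
    """Convert raw model scores into a probability distribution."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def least_confidence(probs):
    """0 when one label holds all the probability, 1 when all labels are equal."""
    n = len(probs)  # assumes at least two labels
    return (1.0 - max(probs)) * n / (n - 1)

def margin_of_confidence(probs):
    """1 minus the gap between the two most probable labels."""
    top, second = sorted(probs, reverse=True)[:2]
    return 1.0 - (top - second)

def entropy_score(probs):
    """Shannon entropy divided by its maximum, log2(number of labels)."""
    raw = -sum(p * math.log2(p) for p in probs if p > 0)
    return raw / math.log2(len(probs))

def select_for_review(predictions, budget, score_fn=least_confidence):
    """Return the ids of the `budget` most uncertain items.

    `predictions` maps an item id to its label probabilities.
    """
    ranked = sorted(predictions,
                    key=lambda item: score_fn(predictions[item]),
                    reverse=True)
    return ranked[:budget]

# Hypothetical model scores for three unlabeled messages.
predictions = {
    "msg_a": softmax([4.0, 0.5, 0.1]),   # model is fairly sure
    "msg_b": softmax([1.1, 1.0, 0.9]),   # nearly uniform
    "msg_c": softmax([2.0, 1.9, -3.0]),  # torn between two labels
}
print(select_for_review(predictions, budget=2))  # ['msg_b', 'msg_c']
```

With a review budget of two items, the confident prediction (`msg_a`) stays with the model and the two ambiguous ones are queued for a person. Passing `score_fn=margin_of_confidence` or `score_fn=entropy_score` swaps the ranking criterion without changing the rest of the loop.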

About the book

Human-in-the-Loop Machine Learning is a guide to optimizing the human and machine parts of your machine learning systems, to ensure that your data and models are correct, relevant, and cost-effective. 20-year machine learning veteran Robert Munro lays out strategies to get machines and humans working together efficiently, including building reliable user interfaces for data annotation, Active Learning strategies to sample for human feedback, and Transfer Learning. By the time you’re done, you’ll be able to design machine learning systems that automatically select the right data for humans to review and ensure that those annotations are accurate and useful.

What's inside

  • Active Learning to sample the right data for humans to annotate
  • Annotation strategies to provide the optimal interface for human feedback
  • Techniques to select the right people to annotate data and ensure quality control
  • Supervised machine learning design and query strategies to support Human-in-the-Loop systems
  • Advanced Adaptive Learning approaches that use machine learning to optimize each step in the Human-in-the-Loop process
  • Real-world use cases from well-known data scientists

About the author

Robert Munro has built Annotation, Active Learning, and machine learning systems with machine learning-focused startups and with larger companies including Amazon, Google, IBM, and most major phone manufacturers. If you speak to your phone, if your car parks itself, if your music is tailored to your taste, or if your news articles are recommended for you, then there is a good chance that Robert contributed to this experience.

Robert holds a PhD from Stanford focused on Human-in-the-Loop machine learning for healthcare and disaster response, and is a disaster response professional in addition to being a machine learning professional. A worked example throughout this text is classifying disaster-related messages from real disasters that Robert has helped respond to in the past.

Manning Early Access Program (MEAP) Read chapters as they are written, get the finished eBook as soon as it’s ready, and receive the pBook long before it's in bookstores.
MEAP combo $59.99 pBook + eBook + liveBook
MEAP eBook $47.99 pdf + ePub + kindle + liveBook
