Fighting Churn with Data
The science and strategy of customer retention
Carl S. Gold
  • MEAP began June 2019
  • Publication in November 2020 (estimated)
  • ISBN 9781617296529
  • 375 pages (estimated)
  • printed in black & white

This book is a priceless source of hands-on insights for analyzing and dealing with churn-related problems. The case studies are pure gold.

Milorad Imbra
The beating heart of any product or service business is returning clients. Don't let your hard-won customers vanish, taking their money with them. In Fighting Churn with Data you'll learn powerful data-driven techniques to maximize customer retention and minimize actions that cause them to stop engaging or unsubscribe altogether. This hands-on guide is packed with techniques for converting raw data into measurable metrics, testing hypotheses, and presenting findings that are easily understandable to non-technical decision makers.

About the Technology

Churn is the bane of any business that relies on recurring revenue, such as content subscriptions, software as a service, and retail sales. You can improve customer retention through product changes and targeted engagement campaigns based on data-driven interventions. Data scientists and business analysts employ huge datasets of user behavior to determine why customers leave and implement processes to stop them from doing so.

About the book

Fighting Churn with Data is your guide to keeping your customers for the long haul. Carl S. Gold, Chief Data Scientist at Zuora, provides a clear overview of churn concepts, along with hands-on tips and tricks he has developed through years of experience analyzing customer behavior. Packed with project-based examples, this book teaches you to convert raw data into measurable customer metrics, develop and test hypotheses about churn rates, and present your findings clearly to non-technical decision makers in marketing and sales. Using this book, anyone with a modest data analysis background can get churn analysis right and reap the revenue benefits of high customer retention.
Table of Contents

Part 1: Building an Arsenal

1 The world of churn

1.1 Why you are reading this book

1.1.1 The typical churn scenario

1.1.2 What this book is about

1.2 Fighting Churn

1.2.1 Interventions that reduce churn

1.2.2 Why churn is hard to fight

1.2.3 Great customer metrics: weapons in the fight against churn

1.3 Why this book is different

1.3.1 Practical and in depth

1.3.2 Simulated case study

1.4 Products with recurring user interactions

1.4.1 Paid consumer products

1.4.2 Business-to-business services

1.4.3 Ad-supported media and apps

1.4.4 Consumer feed subscriptions

1.4.5 Freemium business models

1.4.6 In-app purchase models

1.5 Non-subscription churn scenarios

1.5.1 Inactivity as churn

1.5.2 Free trial conversion

1.5.3 Upsell/down sell

1.5.4 Other yes/no (binary) customer predictions

1.5.5 Customer activity predictions

1.5.6 Use cases that are not like churn

1.6 Customer behavior data

1.6.1 Customer events in common product categories

1.6.2 The most important events

1.7 Case studies in fighting churn

1.7.1 Klipfolio, Inc.

1.7.2 Broadly, Inc.

1.7.3 Versature

1.7.4 Social network simulation

1.8 Case studies in great customer metrics

1.8.1 Utilization

1.8.2 Success rates

1.8.3 Unit cost

1.9 Summary

2 Measuring churn

2.1 Definition of the churn rate

2.1.1 Calculating the churn rate and retention rate

2.1.2 The relationship between churn rate and retention rate

2.2 Subscription databases

2.3 Basic churn calculation: Net retention

2.3.1 Net retention calculation

2.3.2 SQL net retention calculation

2.3.3 Interpreting net retention

2.4 Standard account-based churn

2.4.1 Standard churn rate definition

2.4.2 Outer joins for churn calculation

2.4.3 Standard churn calculation with SQL

2.4.4 When to use the standard churn rate

2.5 Activity (event-based) churn for non-subscription products

2.5.1 Defining an active account and churn from events

2.5.2 Activity churn calculations with SQL

2.6 Advanced churn: Monthly recurring revenue (MRR) churn

2.6.1 MRR churn definition and calculation

2.6.2 MRR churn calculation with SQL

2.6.3 MRR churn vs. account churn vs. net (retention) churn

2.7 Churn rate measurement conversion

2.7.1 Survivor analysis (advanced)

2.7.2 Churn rate conversions

2.7.3 Converting any churn measurement window in SQL

2.7.4 Picking the churn measurement window

2.7.5 Seasonality and churn rates

2.8 Summary

3 Measuring customers

3.1 From events to metrics

3.2 Event data warehouse schema

3.3 Counting events in one time period

3.4 Details of metric period definitions

3.4.1 Weekly behavioral cycles

3.4.2 Timestamps for metric measurements

3.5 Making measurements at different points in time

3.5.1 Overlapping measurement windows

3.5.2 Timing metric measurements

3.5.3 Saving metric measurements

3.5.4 Saving metrics for the simulation examples

3.6 Measuring totals and averages of event properties

3.7 Metric quality assurance

3.7.1 Testing how metrics change over time

3.7.2 Metric quality assurance (QA) case studies

3.7.3 Checking how many accounts receive metrics

3.8 Event QA

3.8.1 Checking how events change over time

3.8.2 Checking events per account

3.9 Selecting the measurement period for behavioral measurements

3.10 Measuring account tenure

3.10.1 Account tenure definition

3.10.2 Recursive table expressions for account tenure

3.10.3 Account tenure SQL program

3.11 Measuring MRR and other subscription metrics

3.11.1 Calculating MRR as a metric

3.11.2 Subscriptions for specific amounts

3.11.3 Calculating subscription unit quantities as metrics

3.11.4 Calculating the billing period as a metric

3.12 Summary

4 Observing renewal and churn

4.1 Introduction to datasets

4.2 How to observe customers

4.2.1 Observation lead time

4.2.2 Observing sequences of renewals and a churn

4.2.3 Overview of creating a dataset from subscriptions

4.3 Identifying active periods from subscriptions

4.3.1 Active periods

4.3.2 Schema for storing active periods

4.3.3 Finding active periods that are ongoing

4.3.4 Finding active periods ending in churn

4.4 Identifying active periods for non-subscription products

4.4.1 Active period definition

4.4.2 Process for forming datasets from events

4.4.3 SQL for calculating active weeks

4.5 Picking observation dates

4.5.1 Balancing churn and non-churn observations

4.5.2 Observation date-picking algorithm

4.5.3 Observation date SQL program

4.6 Exporting a churn dataset

4.6.1 Dataset creation SQL program

4.7 Exporting the current customers for segmentation

4.7.1 Selecting active accounts and metrics

4.7.2 Segmenting customers by their metrics

4.8 Summary

Part 2: Waging the War

5 Understanding churn and behavior with metrics

5.1 Metric cohort analysis

5.1.1 The idea behind cohort analysis

5.1.2 Cohort analysis with Python

5.1.3 Cohorts of product use

5.1.4 Cohorts of account tenure

5.1.5 Cohort analysis of billing period

5.1.6 Minimum cohort size

5.1.7 Significant and insignificant cohort differences

5.1.8 Metric cohorts with a majority of zero customer metrics

5.1.9 Causality: Are the metrics causing churn?

5.2 Summarizing customer behavior

5.2.1 Understanding the distribution of the metrics

5.2.2 Calculating dataset summary statistics in Python

5.2.3 Screening rare metrics

5.2.4 Involving the business in data quality assurance

5.3 Scoring metrics

5.3.1 The idea behind metric scores

5.3.2 The metric score algorithm

5.3.3 Calculating metric scores in Python

5.3.4 Cohort analysis with scored metrics

5.3.5 Cohort analysis of monthly recurring revenue

5.4 Removing unwanted or invalid observations

5.4.1 Removing non-paying customers from churn analysis

5.4.2 Removing observations based on metric thresholds in Python

5.4.3 Removing zero measurements from rare metric analyses

5.4.4 Disengaging behaviors: Metrics associated with increasing churn

5.5 Segmenting customers using cohort analysis

5.5.1 Segmenting process

5.5.2 Choosing segment criteria

5.6 Summary

6 Relationships between customer behaviors

6.1 Correlation between behaviors

6.1.1 Correlation between pairs of metrics

6.1.2 Investigating correlations with Python

6.1.3 Understanding correlations between sets of metrics with correlation matrices

6.1.4 Case study correlation matrices

6.1.5 Calculating correlation matrices in Python

6.2 Averaging groups of behavioral metrics

6.2.1 Why you average correlated metric scores

6.2.2 Averaging scores with a matrix of weights (loading matrix)

6.2.3 Case study for loading matrices

6.2.4 Applying a loading matrix in Python

6.2.5 Churn cohort analysis on metric group average scores

6.3 Discovering groups of correlated metrics

6.3.1 Grouping metrics by clustering correlations

6.3.2 Clustering correlations in Python

6.3.3 Loading matrix weights that make the average of scores a score

6.3.4 Running the metric grouping and grouped cohort analysis listings

6.3.5 Picking the correlation threshold for clustering

6.4 Explaining correlated metric groups to the business

6.5 Summary

7 Segmenting customers with advanced metrics

7.1 Ratio metrics

7.1.1 When to use ratio metrics and why

7.1.2 How to calculate ratio metrics

7.1.3 Ratio metric case study examples

7.1.4 Additional ratio metrics for the simulated social network

7.2 Percent of total metrics

7.2.1 Calculating percentage of total metrics

7.2.2 Percentage of total metric case study with two metrics

7.2.3 Percentage of total metric case study with multiple metrics

7.3 Metrics that measure change

7.3.1 Measuring change in the level of activity

7.3.2 Scores for metrics with extreme outliers (fat tails)

7.3.3 Measuring the time since the last activity

7.4 Scaling metric time periods

7.4.1 Scaling longer metrics to shorter quoting periods

7.4.2 Estimating metrics for new accounts

7.5 User metrics

7.5.1 Measuring active users

7.5.2 Active user metrics

7.6 Which ratios to use

7.6.1 Why use ratios and what else is there?

7.6.2 Which ratios to use

7.7 Summary

Part 3: Special Weapons and Tactics

8 Forecasting churn

8.1 Forecasting churn with a model

8.1.1 Probability forecasts with a model

8.1.2 Engagement and retention probability

8.1.3 Engagement is derived from customer behavior

8.1.4 An offset matches observed churn rates to the S-curve

8.1.5 The logistic-regression probability calculation

8.2 Review of data preparation

8.3 Fitting a churn model

8.3.1 Results of logistic regression

8.3.2 Logistic regression code

8.3.3 Explaining logistic regression results

8.3.4 Logistic regression case study

8.3.5 Calibration and historical churn probabilities

8.4 Forecasting churn probabilities

8.4.1 Preparing the current customer dataset for forecasting

8.4.2 Preparing the current customer data for segmenting

8.4.3 Forecasting with a saved model

8.4.4 Forecasting case studies

8.4.5 Forecast calibration and forecast drift

8.5 Pitfalls of churn forecasting

8.5.1 Correlated metrics

8.5.2 Outliers

8.6 Customer lifetime value

9 Forecast accuracy and machine learning

9.1 Measuring the accuracy of churn forecasts

9.1.1 Why you don’t use the standard accuracy measurement for churn

9.1.2 Measuring churn forecast accuracy with the AUC

9.1.3 Measuring churn forecast accuracy with the lift

9.2 Historical accuracy simulation: backtesting

9.2.1 What and why of backtesting

9.2.2 Backtesting code

9.2.3 Backtesting considerations and pitfalls

9.3 The regression control parameter

9.3.1 Controlling the strength and number of regression weights

9.3.2 Regression with the control parameter

9.4 Picking the regression parameter by testing (cross-validation)

9.4.1 Cross-validation

9.4.2 Cross-validation code

9.4.3 Regression cross-validation case studies

9.5 Forecasting churn risk with machine learning

9.5.1 The XGBoost learning models

9.5.2 XGBoost cross-validation

9.5.3 Comparison of XGBoost accuracy to regression

9.5.4 Comparison of advanced and basic metrics

9.6 Segmenting customers with machine learning forecasts

9.7 Summary

10 Churn demographics and firmographics

10.1 Demographic and firmographic datasets

10.1.1 Types of demographic and firmographic data

10.1.2 Account data model for the social network simulation

10.1.3 Demographic dataset SQL

10.2 Churn cohorts with demographic and firmographic categories

10.2.1 Churn rate cohorts for demographic categories

10.2.2 Churn rate confidence intervals

10.2.3 Comparing demographic cohorts with confidence intervals

10.3 Grouping demographic categories

10.3.1 Representing groups with a mapping dictionary

10.3.2 Cohort analysis with grouped categories

10.3.3 Designing category groups

10.4 Churn analysis for date- and numeric-based demographics

10.5 Churn forecasting with demographic data

10.5.1 Converting text fields to dummy variables

10.5.2 Forecasting churn with categorical dummy variables alone

10.5.3 Combining dummy variables with numeric data

10.5.4 Forecasting churn with demographic and metrics combined

10.6 Segmenting current customers with demographic data

10.7 Summary

11 Leading the fight against churn

11.1 Planning your own fight against churn

11.1.1 Picking churn reduction strategies

11.1.2 Data processing and analysis checklist

11.1.3 Communication to the business checklist

11.2 Running the book listings on your own data

11.2.1 Loading your data into this book’s data schema

11.2.2 Running the listings on your own data

11.3 Porting this book’s listings to different environments

11.3.1 Porting the SQL listings

11.3.2 Porting the Python listings

11.4 Learning more and keeping in touch

11.4.1 Author’s blog site and social media

11.4.2 Sources for churn benchmark information

11.4.3 Other sources of information about churn

11.4.4 Products that help with churn

11.5 Summary

What's inside

  • Calculate churn metrics from a subscription database
  • Spot the user behavior that is most predictive of churn
  • Master churn reduction tactics with customer segmentation
  • Apply churn analysis techniques to other business areas
  • Communicate data-based findings to non-technical stakeholders

About the reader

For readers with basic data analysis skills, including Python and SQL.

About the author

Carl Gold is the Chief Data Scientist at Zuora, Inc., a comprehensive subscription management platform and newly public Silicon Valley "unicorn." Zuora is widely recognized as a leader in all things pertaining to subscriptions and recurring revenue, with 1,000 customers across a range of industries worldwide. Carl joined Zuora in 2015 and created the predictive analytics system for Zuora’s subscriber analysis product, Zuora Insights.

Manning Early Access Program (MEAP) Read chapters as they are written, get the finished eBook as soon as it’s ready, and receive the pBook long before it's in bookstores.
print book $29.99 $59.99 pBook + eBook + liveBook
Additional shipping charges may apply

eBook $24.99 $47.99 3 formats + liveBook
