Practical Recommender Systems
Kim Falk
  • January 2019
  • ISBN 9781617292705
  • 432 pages
  • printed in black & white

"Covers the technical background and demonstrates implementations in clear and concise Python code."

Andrew Collier, Exegetic

Online recommender systems help users find movies, jobs, restaurants—even romance! There’s an art in combining statistics, demographics, and query terms to achieve results that will delight them. Learn to build a recommender system the right way: it can make or break your application!

About the technology

Recommender systems are everywhere, helping you find everything from movies to jobs, restaurants to hospitals, even romance. Using behavioral and demographic data, these systems make predictions about what users will be most interested in at a particular time, resulting in high-quality, ordered, personalized suggestions. Recommender systems are practically a necessity for keeping your site content current, useful, and interesting to your visitors.

About the book

Practical Recommender Systems explains how recommender systems work and shows how to create and apply them for your site. After covering the basics, you’ll see how to collect user data and produce personalized recommendations. You’ll learn how to use the most popular recommendation algorithms and see examples of them in action on sites like Amazon and Netflix. Finally, the book covers scaling problems and other issues you’ll encounter as your site grows.

Table of Contents

1 What is a recommender?

1.1 Real-life recommendations

1.1.1 Recommender systems are at home on the internet

1.1.2 The long tail

1.1.3 The Netflix recommender system

1.1.4 Recommender system definition

1.2 Taxonomy of recommender systems

1.2.1 Domain

1.2.2 Purpose

1.2.3 Context

1.2.4 Personalization level

1.2.5 Whose opinions

1.2.6 Privacy and trustworthiness

1.2.7 Interface

1.2.8 Algorithms

1.3 Machine learning and the Netflix Prize

1.4 The MovieGEEKs website

1.4.1 Design and specification

1.4.2 Architecture

1.5 Building a recommender system


Part 1: Getting ready for recommender systems

2 User behavior and how to collect it

2.1 How (I think) Netflix gathers evidence while you browse

2.1.1 The evidence Netflix collects

2.2 Finding useful user behavior

2.2.1 Capturing visitor impressions

2.2.2 What you can learn from a shop browser

2.2.3 Act of buying

2.2.4 Consuming products

2.2.5 Getting to know your customers the Netflix way

2.3 Identifying users

2.4 Getting visitor data from other sources

2.5 The collector

2.5.1 Build the project files

2.5.2 The Data Model

2.5.3 The snitch: client-side evidence collector

2.5.4 Integrating the collector into MovieGEEKs

2.6 What is a user in the system, and how to model them


3 Monitoring the system

3.1 Why adding a dashboard is a good idea

3.1.1 Answering “How are we doing?”

3.2 Doing the analytics

3.2.1 Web analytics

3.2.2 The basic statistics

3.2.3 Conversions

3.2.4 Analyzing the path up to conversion

3.2.5 Conversion path

3.3 Personas

3.4 MovieGEEKs dashboard

3.4.1 Auto-generating data for our log

3.4.2 Specification and design of the analytics dashboard

3.4.3 Analytics dashboard wireframe

3.4.4 Architecture


4 On ratings and how to calculate them

4.1 User-item preferences

4.1.1 Definition of ratings

4.1.2 User-item matrix

4.2 Explicit or implicit ratings

4.2.1 How we use trusted sources for recs

4.3 Revisiting explicit ratings

4.4 What are implicit ratings?

4.4.1 People suggestions

4.4.2 Considerations when calculating ratings

4.5 Calculating implicit ratings

4.5.1 Looking at the behavioral data

4.5.2 This could be considered a machine-learning problem

4.6 How to implement implicit ratings

4.6.1 Adding the time aspect

4.7 Less frequent items provide more value


5 Non-personalized recommendations

5.1 What's a non-personalized recommendation?

5.1.1 What is a recommendation, and what is a commercial?

5.1.2 What is a non-personalized recommendation?

5.2 How to make recommendations when you don't have any data

5.2.1 Top 10: a chart of items

5.3 Implementing the chart and the groundwork for the recommender system component

5.3.1 The recommender system component

5.3.2 Code from GitHub

5.3.3 A recommender system

5.3.4 Adding the chart to MovieGEEKs

5.4 Seeded recommendations

5.4.1 Top 10 items bought by the same users as the one you're viewing

5.4.2 Association rules

5.4.3 Implementing association rules

5.4.4 Saving the association rules in the database

5.4.5 Running the association rules calculator

5.4.6 Use different events to create the association rules


6 The user (and content) who came in from the cold

6.1 What's a cold start?

6.1.1 A cold product

6.1.2 A cold visitor

6.1.3 Gray sheep

6.1.4 Let’s look at some real-life examples

6.1.5 So, what can we do about cold starts?

6.2 Keeping track of visitors

6.2.1 Persisting anonymous users

6.3 Addressing cold-start problems with algorithms

6.3.1 Using association rules to create recs for cold users

6.3.2 Using domain knowledge and business rules

6.3.3 Using segments

6.3.4 Using categories to get around the gray sheep problem and how to introduce cold products

6.4 Those who don't ask won't know

6.4.1 When the visitor is not new any longer

6.5 Using association rules to start recommending things fast

6.5.1 Find the collected items

6.5.2 Retrieve association rules and order them according to confidence

6.5.3 Display the recs

6.5.4 Implementation evaluation


Part 2: Recommender algorithms

7 Finding similarities between users and between content

7.1 Why similarity?

7.1.1 What is a similarity function?

7.2 Essential similarity functions

7.2.1 Jaccard distance

7.2.2 Lp-norms

7.2.3 Cosine similarity

7.2.4 Pearson similarity

7.2.5 Test-running Pearson similarity

7.2.6 Pearson is really similar to cosine

7.3 k-means clustering

7.3.1 The k-means clustering algorithm

7.3.2 Translating k-means clustering into Python

7.4 Implementing similarities

7.4.1 Implementing similarity on the MovieGEEKs site

7.4.2 Implementing clustering on the MovieGEEKs site


8 Collaborative filtering in the neighborhood

8.1 Collaborative filtering: A history lesson

8.1.1 When information became collaboratively filtered

8.1.2 Helping each other

8.1.3 The rating matrix

8.1.4 The collaborative filtering pipeline

8.1.5 User-user collaborative filtering

8.1.6 Data requirements

8.2 Calculating recommendations

8.3 Calculating similarities

8.4 Amazon’s algorithm to precalculate item similarity

8.5 Ways to select the neighborhood

8.6 Finding the right neighborhood

8.7 Ways to calculate predicted ratings

8.8 Prediction with item-based filtering

8.8.1 Compute item predictions

8.9 Cold-start problems

8.10 A few words on machine learning terms

8.11 Collaborative filtering on the MovieGEEKs site

8.12 What’s the difference between association rule recs and collaborative recs?

8.13 Levers to fiddle with for collaborative filtering

8.14 Pros and cons of collaborative filtering


9 Evaluating and testing your recommender

9.1 Business wants lift, cross-sales, up-sales, and conversions

9.2 Why is it important to evaluate?

9.3 How to interpret user behavior

9.4 What to measure

9.4.1 Understanding my taste — minimizing prediction error

9.4.2 Diversity

9.4.3 Coverage

9.4.4 Serendipity

9.5 Before implementing the recommender…

9.5.1 Verify the algorithm

9.5.2 Regression testing

9.6 Types of evaluation

9.7 Offline evaluation

9.7.1 What to do when the algorithm doesn’t produce any recommendations

9.8 Offline experiments

9.8.1 Performing the experiment

9.9 Implementing the experiment in MovieGEEKs

9.9.1 What we will implement

9.10 Evaluating the test set

9.10.1 Starting out with the baseline predictor

9.10.2 Finding the right parameters

9.11 Online evaluation

9.11.1 Family and friends

9.11.2 A/B testing

9.12 Continuous testing with exploit/explore

9.12.1 Feedback loops


10 Content-based filtering

10.1 Descriptive example

10.2 Content-based filtering

10.3 Content Analyzer

10.3.1 Feature extraction for the item profile

10.3.2 Categorical data with small numbers

10.3.3 Converting the year to a comparable feature

10.4 Extracting metadata from descriptions

10.4.1 Preparing descriptions

10.5 Finding important words with term frequency-inverse document frequency (TF-IDF)

10.6 Topic modeling using LDA

10.6.1 What knobs can we turn to tweak the LDA?

10.7 Finding similar content

10.8 Creating the user profile

10.8.1 Creating the user profile with TF-IDF

10.9 Content-based recommendations in MovieGEEKs

10.9.1 Loading data

10.9.2 Training the model

10.9.3 Creating item profiles

10.9.4 Creating user profiles

10.9.5 Showing recommendations

10.10 Evaluation of the content-based recommender

10.11 Pros and cons of content-based filtering


11 Finding hidden genres with matrix factorization

11.1 Sometimes it’s good to reduce the amount of data

11.2 Example of what we want to solve

11.3 A whiff of linear algebra

11.3.1 Matrix

11.3.2 What is factorization?

11.4 Constructing the factorization using SVD

11.4.1 Adding a new user by folding in

11.4.2 How to do recommendations with SVD

11.4.3 Baseline predictors

11.4.4 Temporal dynamics

11.5 Constructing the factorization using Funk SVD

11.5.1 Root mean squared error

11.5.2 Gradient descent

11.5.3 Stochastic gradient descent

11.5.4 And finally, to the factorization

11.5.5 Adding biases

11.5.6 How to start and when to stop

11.6 Doing recommendations with Funk SVD

11.7 Funk SVD implementation in MovieGEEKs

11.7.1 What to do with outliers

11.7.2 Keeping the model up to date

11.7.3 Faster implementation

11.8 Explicit vs. implicit data

11.9 Evaluation

11.10 Levers to fiddle with for Funk SVD


12 Taking the best of all algorithms: Implementing hybrid recommenders

12.1 The confused world of hybrids

12.2 The monolithic

12.2.1 Mixing content-based features with behavioral data to improve collaborative filtering recommenders

12.3 Mixed hybrid recommender

12.4 The ensemble

12.4.1 Switched ensemble recommender

12.4.2 Weighted ensemble recommender

12.4.3 Linear regression

12.5 Feature-weighted linear stacking (FWLS)

12.5.1 Meta features: Weights as functions

12.5.2 The algorithm

12.6 Implementation


13 Ranking and learning to rank

13.1 Learning to rank: an example at Foursquare

13.2 Re-ranking

13.3 What’s learning to rank again?

13.3.1 The three types of learning-to-rank algorithms

13.4 Bayesian Personalized Ranking

13.4.1 BPR

13.4.2 Math magic (advanced section)

13.4.3 The BPR algorithm

13.4.4 Bayesian Personalized Ranking with Matrix Factorization

13.5 Implementation of BPR

13.5.1 Doing the recommendations

13.6 Evaluation

13.7 Levers to fiddle with for BPR


14 The future of recommender systems

14.1 This book in a few sentences

14.2 Topics to study next

14.2.1 Further reading

14.2.2 Algorithms

14.2.3 Context

14.2.4 Human-computer interaction

14.2.5 Choosing a good architecture

14.3 What’s the future of recommender systems?

14.4 Final thoughts

What's inside

  • How to collect and understand user behavior
  • Collaborative and content-based filtering
  • Machine learning algorithms
  • Real-world examples in Python

About the reader

Readers need intermediate programming and database skills.

About the author

Kim Falk is an experienced data scientist who works daily with machine learning and recommender systems.

We interviewed Kim as part of our Six Questions series.

Print book: $34.99 (list price $49.99), pBook + eBook + liveBook

eBook: $27.99 (list price $39.99), 3 formats + liveBook