Practical Recommender Systems goes behind the curtain to show you how recommender systems work and, more importantly, how to create and apply them for your site. After you've covered the basics of how recommender systems work, you'll discover how to collect user data and produce personalized recommendations. Next, you'll learn how and where to use the most popular recommendation algorithms and see examples of them in action on sites like Amazon and Netflix. Finally, this hands-on guide covers scaling problems and other issues you may encounter as your site grows.
1. What is a recommender?
1.1. Real-life recommendations
1.1.1. Recommender systems are at home on the internet
1.1.2. The Netflix recommender system
1.1.3. Recommender system definition
1.2. Taxonomy of recommender systems
1.2.4. Personalization level
1.2.5. Whose opinions
1.2.6. Privacy and trustworthiness
1.3. Machine learning and the Netflix Prize
1.4. The Movie GEEKs website
1.4.1. Design and specification
1.5. Building a recommender system
Part 1: Introduction to Recommender Systems
2. User behavior and how to collect it
2.1. How (I think) Netflix gathers evidence while you browse
2.1.1. The evidence Netflix collects
2.2. Finding useful user behavior
2.2.1. What you can learn from a browser
2.2.2. Act of buying
2.2.3. Consuming products
2.2.4. Visitor ratings
2.2.5. Getting to know your customers the Netflix way
2.3. Identifying users
2.4. Getting visitor data from other sources
2.5. The collector
2.5.1. Build the project files
2.5.2. The Data Model
2.5.3. The snitch - client-side evidence collector
2.6. Integrate the collector into MovieGEEK
2.7. What is a user in the system and how to model them
3. Analytics primer and implementing a dashboard
3.1. Why adding a dashboard is a good idea
3.1.1. Answering "How are we doing?"
3.2. Doing the analytics
3.2.1. Web analytics
3.2.2. The basic statistics
3.2.4. Analyzing the path up to conversion
3.2.5. Conversion path
3.3. MovieGEEKs dashboard
3.3.1. Specification and design of the analytics dashboard
3.3.2. Analytics dashboard wireframe
3.4. Summary and what's to come
4. On ratings and how to calculate them
4.1. User-item preferences
4.1.1. Definition of ratings
4.1.2. User-item matrix
4.2. Explicit or implicit ratings
4.2.1. How we use trusted sources for recs
4.3. Revisiting explicit ratings
4.4. What are implicit ratings
4.4.1. People suggestions
4.4.2. Considerations of calculating ratings
4.5. Calculating implicit ratings
4.5.1. Looking at the behavioral data
4.5.2. This could be considered a machine-learning problem
4.6. How to implement these implicit rating calculations
4.6.1. Adding the time aspect
4.7. Less frequent items provide more value
5. Non-personalized recommendations
5.1. What is a non-personalized recommendation
5.1.1. What is a recommendation and what is a commercial
5.1.2. What is non-personalized recommendation
5.2. How to make recommendations when you don’t have any data
5.3. Top 10 - A chart of items
5.4. Implementing the chart and, in the process, the groundwork for the recommender system component
5.4.1. The recommender system component
5.4.2. Code from GitHub
5.4.3. A recommender system
5.4.4. Adding the chart to MovieGEEKs
5.4.5. Making the content look more attractive
5.5. Seeded recommendations
5.5.1. Top 10 items bought by the same user as the one you are viewing
5.5.2. Association rules
5.5.3. Implementing association rules
5.5.4. Saving the association rules in the database
5.5.5. Use different events to create the association rules
6. The user (and content) who came in from the Cold
6.1. What is a cold start?
6.1.1. Cold product
6.1.2. A cold visitor
6.1.3. Gray sheep
6.1.4. Let’s look at some real-life examples
6.1.5. So, what can we do about cold starts?
6.2. Keeping track of visitors
6.2.1. Persisting anonymous users
6.3. Three ways to address the cold start problem with algorithms
6.3.1. Using association rules to create recs for cold users
6.3.2. Using domain knowledge and business rules
6.3.3. Using segments
6.3.4. A possible way to get around the gray sheep problem and how to introduce cold products
6.4. He who does not ask, will not know
6.5. When the visitor is not new any longer
6.6. Implementing a first-time visitor greeting with association rules
6.6.1. Find the collected items
6.6.2. Retrieve association rules and order them according to confidence
6.6.3. Display the recs
6.6.4. Implementation evaluation
7. Finding similarities between users and between content
7.1. Why do we need to talk about similarity?
7.1.1. What is a similarity function?
7.2. Essential similarity functions
7.2.1. Jaccard distance
7.2.3. Cosine similarity
7.2.4. Pearson similarity
7.2.5. Test running Pearson similarity
7.2.6. Pearson is really similar to cosine
7.3. K-means clustering
7.3.1. k-means clustering algorithm
7.3.2. Translating k-means clustering into Python
7.4. Implementing Similarities
7.4.1. Implement the similarity in MovieGEEKs site
7.4.2. Implement the clustering in MovieGEEKs site
8. Collaborative Filtering in the Neighborhood
8.1. What is collaborative filtering
8.1.1. When information became collaboratively filtered
8.1.2. Helping each other
8.1.3. The rating matrix
8.1.4. The collaborative filtering pipeline
8.1.5. User-user collaborative filtering
8.1.6. Data Requirements
8.2. Calculate Recommendations
8.3. Calculating the similarities
8.4. Amazon's algorithm to pre-calculate item similarity
8.5. Ways to select the neighborhood
8.6. Finding the right neighborhood
8.7. Ways to calculate predicted ratings
8.8. Prediction with item-based filtering
8.8.1. Compute item predictions
8.9. Cold start problems
8.10. A few words on machine learning terms
8.11. Collaborative filtering on the MovieGEEK site
8.11.1. Item based filtering
8.12. What is the difference between association rule recs and collaborative recs?
9. Evaluating and testing your recommender
9.1. Business wants lift, cross-sales, up-sales, and conversions
9.2. Why is it important to evaluate?
9.3. What to measure
9.3.1. Understanding my taste - minimizing prediction error
9.4. Even before implementing the recommender
9.4.1. Verify the algorithm
9.4.2. Regression Testing
9.5. Types of evaluation
9.6. Offline evaluation
9.7. What to do when the algorithm doesn’t produce any recommendations
9.8. Offline experiments
9.8.1. Performing the experiment
9.9. Implementing the experiment
9.9.1. What we will implement
9.10. Controlled experiments
9.10.1. Family and friends
9.11. A/B testing
9.12. Continuous testing with exploit/explore
10. Content-based Filtering
10.2. Descriptive example
10.3. Content-based filtering
10.4. Content Analyzer
10.4.1. Feature extraction for the item profile
10.4.2. Categorical data with small numbers
10.4.3. Converting the year to a comparable feature
10.5. Extracting Metadata from Descriptions
10.5.1. Preparing Descriptions
10.5.2. The professional Netflix watchers
10.6. Finding important words with Term Frequency - Inverse Document Frequency (TF-IDF)
10.7. Topic modeling using Latent Dirichlet Allocation (LDA)
10.7.1. What knobs can we turn to tweak the LDA?
10.8. Finding similar content
10.9. Creating the user profile
10.10. Content-based recommendations in MovieGEEKs
10.10.1. Loading data
10.10.2. Train the model
10.10.3. Creating item profiles
10.10.4. Creating user profiles
10.10.5. Showing recs
10.11. Evaluation of the content-based recommender
10.12. Pros and cons of content-based filtering
11. Finding hidden genres with Matrix Factorization
11.2. Sometimes it’s good to reduce the size of the data
11.3. Example of what we want to solve
11.4. A whiff of linear algebra
11.4.2. What is factorization?
11.5. Constructing the factorization using SVD
11.5.1. Adding a new user by folding in
11.5.2. How to do recommendations with SVD
11.5.3. Baseline Predictors
11.6. Constructing the factorization using FunkSVD
11.6.1. Root Mean Squared Error
11.6.2. Gradient Descent
11.6.3. Stochastic Gradient Descent
11.6.4. And finally, to the Factorization
11.6.5. Adding Biases
11.6.6. When to stop
11.7. Doing recommendations with FunkSVD
11.8. Funk SVD implementation in MovieGEEKs
11.8.1. Keeping the model up to date
11.8.2. Faster implementation
12. Taking the best of all algorithms - implementing hybrid recommenders
12.1. The confused world of hybrids
12.2. The Monolithic
12.2.1. Mixing content-based features with behavioral data to improve collaborative filtering recommenders
12.3. Mixed Hybrid Recommender
12.4. The Ensemble
12.4.1. Switched ensemble recommender
12.4.2. Weighted Ensemble Recommender
12.5. Feature-Weighted Linear Stacking
12.6. Meta-features - Weights as functions
12.6.1. The algorithm
13. Ranking and Learning to Rank
13.2. Learning to Rank example at Foursquare
13.4. What is learning to rank?
13.4.1. The three types of learning to rank algorithms
13.5. Bayesian Personalized Ranking (BPR)
13.5.1. Math magic (advanced section)
13.5.2. The BPR algorithm
13.5.3. Bayesian Personalized Ranking with Matrix Factorization
13.6. Implementation of BPR
14. Future of Recommender Systems
14.1. This Book in a Few Sentences
14.2. So which of the algorithms should you start out implementing?
14.3. Topics to learn next
14.3.1. Further readings
14.3.4. Human-computer interaction
14.3.5. Choosing a good architecture
14.4. What is the future of recommender systems?
14.5. Final Thoughts
About the Technology
Recommender systems are everywhere, helping you find everything from movies to jobs, restaurants to hospitals, even romance. Using behavioral and demographic data, these systems make predictions about what users will be most interested in at a particular time, resulting in high-quality, ordered, personalized suggestions. Recommender systems are practically a necessity for keeping your site content current, useful, and interesting to your visitors.
- Practical introduction to recommender system algorithms
- Collaborative and content-based filtering
- Creating individual recommendations from visitor data
- Real-world examples of recommender systems
About the reader
This book assumes you're comfortable reading code in Python and have some experience with databases.