Practical Recommender Systems goes behind the curtain to show you how recommender systems work and, more importantly, how to create and apply them for your site. After you've covered the basics of how recommender systems work, you'll discover how to collect user data and produce personalized recommendations. Next, you'll learn how and where to use the most popular recommendation algorithms and see examples of them in action on sites like Amazon and Netflix. Finally, this hands-on guide covers scaling problems and other issues you may encounter as your site grows.
1.1. Real-life recommender system
1.1.1. Recommender systems are at home on the internet
1.1.2. The Netflix recommender system
1.1.3. Recommender system definition
1.2. Taxonomy of recommender systems
1.2.4. Personalization level
1.2.5. Whose opinions
1.2.6. Privacy and trustworthiness
1.3. Machine learning and the Netflix Prize
1.4. The MovieGEEKs website
1.4.1. Design and specification
Part 1: Introduction to Recommender Systems
2. User behavior and how to collect it
2.1. How (I think) Netflix gathers evidence while you browse
2.1.1. The evidence Netflix collects
2.2. Finding useful user behavior
2.2.1. Capturing visitor impressions
2.2.2. What you can learn from a browser
2.2.3. Act of buying
2.2.4. Consuming products
2.2.5. Getting to know your customers the Netflix way
2.2.6. Identifying users
2.3. Getting visitor data from other sources
2.4. The collector
2.4.1. Build the project files
2.4.2. The snitch: client-side evidence collector
2.5. Integrate the collector into MovieGEEKs
2.6. What is a user in the system and how to model them
3. Analytics primer and implementing a dashboard
3.1. Why adding a dashboard is a good idea
3.1.1. Answering "How are we doing?"
3.2. Doing the analytics
3.2.1. Web analytics
3.2.2. The basic statistics
3.2.4. Analyzing the path up to conversion
3.2.5. Conversion path
3.3. MovieGEEKs dashboard.
3.3.1. Specification and design of the analytics dashboard
3.3.2. Analytics dashboard wireframe
3.4. Summary and what's to come
4. On ratings and how to calculate them
4.1. User-item preferences
4.1.1. Definition of ratings
4.1.2. User-item matrix
4.2. What data can be trusted.
4.2.1. How we use trusted sources for recs
4.3. Revisiting explicit ratings
4.4. What are implicit ratings
4.4.1. People suggestions
4.4.2. Considerations of calculating ratings
4.5. Calculating implicit ratings
4.5.1. Looking at the behavioral data
4.5.2. This could be considered a machine-learning problem
4.6. How to implement these implicit rating calculations
4.6.1. Adding the time aspect
5. Non-personalized recommendations
5.1. What is a non-personalized recommendation
5.1.1. What is a recommendation and what is a commercial.
5.1.2. What is non-personalized recommendation
5.2. How to make recommendations when you don't have any data.
5.3. Top 10 - A chart of Items.
5.4. Implementing the chart and, in the process, the groundwork for the Recommender system component
5.4.1. The recommender system component
5.4.2. Code from GitHub
5.4.3. A recommender system
5.4.4. Adding a chart to MovieGEEKs
5.5. Seeded recommendations
5.5.1. Top 10 items bought by the same user as the one you are viewing.
5.5.2. Association rules
5.5.3. Implementing association rules
5.5.4. Saving the association rules in the database.
5.5.5. Use different events to create the association rules
6. The user (and content) who came in from the Cold
6.1. What is a Cold Start?
6.1.1. Cold product
6.1.2. A cold visitor
6.1.3. Gray sheep
6.1.4. So what can we do about cold starts?
6.2. Keeping track of visitors
6.2.1. Persisting anonymous users
6.3. Three ways to address cold start problem with algorithms.
6.3.1. Using Association Rules to create recs for cold users.
6.3.2. Using domain knowledge and Business rules.
6.3.3. Using Segments
6.3.4. A possible way to get around the Gray Sheep problem and how to introduce cold product
6.4. He who does not ask, will not know
6.4.1. When the visitor is not new any longer
6.5. Implementing first-time visitor greetings with association rules.
6.5.1. Find the Collected items
6.5.2. Retrieve Association rules and order them according to confidence.
6.5.3. Display the recs.
6.5.4. Implementation evaluation
7. Finding similarities between users and between content
7.1. Why do we need to talk about Similarity?
7.1.1. What is a similarity function
7.2. Essential similarity functions
7.2.1. Jaccard distance
7.2.3. Cosine similarity
7.2.4. Pearson Similarity
7.2.5. Test running Pearson Similarity
7.2.6. Pearson is really similar to cosine
7.3. K-means clustering
7.3.1. k-means clustering Algorithm
7.3.2. Translating k-means clustering into Python
7.4. Implementing Similarities
7.4.1. Implement the similarity in MovieGEEKs site
7.4.2. Implement the clustering in MovieGEEKs site
8. Collaborative Filtering in the Neighborhood
8.1. What is collaborative filtering
8.1.1. When information became collaborative filtered
8.1.2. Helping each other
8.1.3. The rating matrix
8.1.4. The collaborative filtering pipeline
8.1.5. User-user collaborative filtering
8.1.6. Data Requirements
8.2. Calculate recommendations
8.3. Calculating the similarities
8.4. Amazon's algorithm to pre-calculate item similarity
8.5. Ways to select the neighborhood
8.6. Finding the right neighborhood
8.7. Ways to calculate predicted ratings
8.8. Prediction with item based filtering
8.8.1. Compute item predictions
8.9. Cold start problems
8.10. A few words on machine learning terms.
8.11. Collaborative filtering on the MovieGEEKs site
8.11.1. Item based filtering
8.12. What is the difference between association rule recs and collaborative recs?
9. Content-based Filtering
9.2. Descriptive example
9.3. Content-based filtering
9.4. Content Analyzer
9.4.1. Feature extraction for the item profile
9.4.2. Categorical Data with small numbers
9.4.3. Converting the year to a comparable feature
9.5. Extracting metadata from descriptions
9.5.1. Preparing descriptions
9.5.2. The professional Netflix watchers
9.6. Finding important words with Term Frequency - Inverse Document Frequency (TF-IDF)
9.7. Topic modeling using the Latent Dirichlet Allocation (LDA)
9.7.1. What parameters can be tuned to tweak the LDA
9.8. Finding similar content
9.9. Creating the user profile
9.10. Content based recommendations in MovieGEEKs
9.10.1. Loading data
9.10.2. Train the model
9.10.3. Creating item profiles
9.10.4. Creating user profiles
9.10.5. Showing recs
9.11. Pros and Cons for content-based filtering.
10. Finding hidden genres with Matrix Factorization
10.2. Sometimes it's good to reduce the size of the data
10.3. Example of what we want to solve
10.4. Linear Algebra
10.4.2. What is Factorization
10.5. Constructing the Factorization using SVD
10.5.1. Adding a new user by folding in
10.5.2. How to do recommendations with SVD
10.5.3. Baseline Predictors
10.5.4. Problems with SVD
10.6. Constructing the factorization using FunkSVD
10.6.1. Root Mean Squared Error
10.6.2. Gradient Descent
10.6.3. Stochastic Gradient Descent
10.6.4. And finally to the Factorization
10.6.5. Adding Biases
10.6.6. When to stop
10.7. Doing recommendations with FunkSVD
10.8. FunkSVD implementation in MovieGEEKs
10.8.1. Keeping the model up to date.
11. Taking the best of all algorithms - implementing hybrid recommenders
11.1. The confused world of hybrids
11.2.1. Mixing content-based features with behavioral data to improve collaborative filtering recommenders.
11.3. Mixed Hybrid Recommender
11.4. Ensemble Recommenders
11.4.1. Switched ensemble recommender
11.4.2. Weighted ensemble recommender
11.5. Feature-Weighted Linear Stacking
11.5.1. Meta-features - Weights as functions
11.5.2. The algorithm
12. Ranking and Learning to Rank
12.2. Learning to Rank example at Foursquare
12.4. What is Learning to Rank?
12.4.1. The three types of learning to Rank algorithms
12.4.2. Ways to gauge quality of ranking
12.5. How to teach the ranking algorithm
12.6. Bayesian Personalized Ranking
12.6.2. Math magic (advanced section)
12.6.3. The BPR algorithm
12.6.4. Bayesian Personalized Ranking with Matrix Factorization
13. Evaluating and testing your recommender
13.1. Business wants Lift, cross-sales, up-sales and conversions.
13.2. Why is it important to evaluate?
13.3. What to measure
13.3.1. Understanding my taste - minimizing prediction Error
13.4. Before even starting the offline evaluation
13.4.1. Verify Algorithm
13.4.2. Regression Testing
13.5. Offline evaluation.
13.6. Types of Evaluation
13.7. Offline experiments
13.7.1. Performing the experiment
13.7.2. Implementing the experiment
13.8. Controlled experiments
13.8.1. Family and friends
13.9. A/B testing
13.10. Continuous testing with exploit/explore.
14. Future of recommender systems
About the Technology
Recommender systems are everywhere, helping you find everything from movies to jobs, restaurants to hospitals, even romance. Using behavioral and demographic data, these systems make predictions about what users will be most interested in at a particular time, resulting in high-quality, ordered, personalized suggestions. Recommender systems are practically a necessity for keeping your site content current, useful, and interesting to your visitors.
- Practical introduction to recommender system algorithms
- Collaborative and content-based filtering
- Creating individual recommendations from visitor data
- Real-world examples of recommender systems
About the reader
This book assumes you're comfortable reading code in Python and have some experience with databases.