Building a Recommendation
system for e-commerce
AI Ukraine 2017
About me
Alex Konduforov
Data Science Group Leader @
Co-organizer @ Kharkiv AI club
Business
“31% of ecommerce revenues were generated from personalized
product recommendations” - Barilliance.com, 2014
“Already, 35% of what consumers purchase on Amazon and 75% of
what they watch on Netflix come from product recommendations
based on such algorithms” - McKinsey
Evolution
Amazon
Netflix
Why recommendations are so important
Traditional retail can serve only
the most popular products.
Online can serve many more
products, but the choice is overwhelming for
customers.
How to apply
Website recommendations
• Main goals: cross-sell, save customers' time
Personalized marketing emails
• Main goals: bring customers back to the website, upsell
Recommendation systems
Formulation of the problem
LOTR Star Wars GoT Matrix
Alice 5 2
Bob 4 5 3
Carol 3 4
David 5 5
The goal of a recommendation system is to predict the blanks in the utility matrix
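A minimal sketch of the utility matrix as a data structure. The ratings per user are the ones on the slide, but the column each rating belongs to is an assumption, since the original table layout did not survive; `None` stands for a blank.

```python
users = ["Alice", "Bob", "Carol", "David"]
items = ["LOTR", "Star Wars", "GoT", "Matrix"]

# None marks a blank (unknown rating). Column placement is ASSUMED:
# only the values per user are taken from the slide.
R = [
    [5, None, 2, None],     # Alice
    [4, 5, None, 3],        # Bob
    [None, 3, 4, None],     # Carol
    [5, None, None, 5],     # David
]

# The cells a recommender has to predict.
blanks = [
    (users[u], items[i])
    for u, row in enumerate(R)
    for i, rating in enumerate(row)
    if rating is None
]
known = sum(rating is not None for row in R for rating in row)
print(f"{known} known ratings, {len(blanks)} blanks to predict")
```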
Gathering Data
Explicit
• Ask people to rate items
• Cons: doesn’t scale, only a small fraction of users leave
ratings and reviews
Implicit
• Inferences from user actions
• Cons: only one value, no difference between dislike and
unknown
Main approaches
• Non-personalized Summary Statistics
• Content-based Filtering
• Collaborative Filtering (nearest neighbors)
• User-User
• Item-Item
• Matrix Factorization
• Hybrid
• Probability models
• etc.
Ecommerce specifics
• Implicit customer feedback (views, purchases, other actions)
• Utility matrix with only 1’s
• Possible to calculate some score but more complicated
• Collaborative Filtering + Matrix Factorization
• Not every similarity/distance works
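As a rough illustration of the two options above (a matrix of 1's versus a calculated score), the sketch below builds both from an event log. The event log, the event names and the weights in `EVENT_WEIGHTS` are all hypothetical; real weights would need tuning on your own data.

```python
from collections import defaultdict

# Hypothetical event log: (user, item, event_type).
events = [
    ("u1", "phone", "view"), ("u1", "phone", "purchase"), ("u1", "case", "view"),
    ("u2", "charger", "cart"), ("u2", "case", "purchase"),
    ("u3", "phone", "view"),
]

# Assumed weights: a purchase says more about preference than a view.
EVENT_WEIGHTS = {"view": 1.0, "cart": 3.0, "purchase": 5.0}

def build_matrices(events, weights):
    """Return (scores, binary) as sparse dicts keyed by (user, item)."""
    scores = defaultdict(float)
    for user, item, event_type in events:
        scores[(user, item)] += weights[event_type]  # repeated actions add up
    # Binary utility matrix: 1 for any interaction, missing key = unknown.
    binary = {pair: 1 for pair in scores}
    return dict(scores), binary

scores, binary = build_matrices(events, EVENT_WEIGHTS)
```

Note that a missing key still means "unknown", not "disliked", which is the limitation of implicit feedback mentioned earlier.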
Collaborative Filtering
Method of making automatic predictions (filtering) about the interests of a user
by collecting preferences or taste information from many users (collaborating)
Types of Collaborative Filtering
User-to-user
1. Look for users who share the same rating patterns with the active user (the
user whom the prediction is for)
2. Use the ratings from those like-minded users to calculate a prediction for
the active user
Item-to-item
1. Build an item-item matrix determining relationships between pairs of items
2. Infer the tastes of the current user by examining the matrix and matching
that user's data
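The two item-to-item steps can be sketched on implicit data as follows. This is one possible reading, not the method of any particular vendor: the purchase history is hypothetical, and cosine similarity over binary user vectors is an assumed choice of item-item relationship.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical purchase history: user -> set of items.
purchases = {
    "alice": {"phone", "case"},
    "bob": {"phone", "case", "charger"},
    "carol": {"phone", "charger"},
    "david": {"laptop"},
}

def item_similarities(purchases):
    """Step 1: build the item-item matrix (cosine over binary user vectors)."""
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for basket in purchases.values():
        for item in basket:
            item_count[item] += 1
        for a, b in combinations(sorted(basket), 2):
            pair_count[(a, b)] += 1
    sims = defaultdict(dict)
    for (a, b), together in pair_count.items():
        s = together / sqrt(item_count[a] * item_count[b])
        sims[a][b] = s
        sims[b][a] = s
    return sims

def recommend(user, purchases, sims, top_n=3):
    """Step 2: score unseen items by similarity to what the user already has."""
    owned = purchases[user]
    scores = defaultdict(float)
    for item in owned:
        for other, s in sims.get(item, {}).items():
            if other not in owned:
                scores[other] += s
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]

sims = item_similarities(purchases)
print(recommend("alice", purchases, sims))
```

The item-item matrix can be computed offline and cached, so only step 2 has to run at request time.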
User-to-user
Similarity
Jaccard similarity
Ignores rating values
Pearson similarity (~cosine on mean-centered ratings)
Unlike plain cosine, missing values are not treated as negative ratings:
after mean-centering, a zero stands for an average rating
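A small sketch of the three measures on sparse rating dictionaries. The ratings are hypothetical. `centered_cosine` is the mean-centered cosine that is often used as a stand-in for Pearson; strict Pearson correlation is computed over co-rated items only, so treat this as an approximation.

```python
from math import sqrt

def jaccard(a, b):
    """Overlap of the rated item sets; rating values are ignored."""
    items_a, items_b = set(a), set(b)
    union = items_a | items_b
    return len(items_a & items_b) / len(union) if union else 0.0

def cosine(a, b):
    """Cosine over sparse rating dicts; a missing rating counts as 0."""
    dot = sum(a[i] * b[i] for i in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def centered_cosine(a, b):
    """Subtract each user's mean first, so a missing rating means 'average'."""
    def center(ratings):
        if not ratings:
            return {}
        mean = sum(ratings.values()) / len(ratings)
        return {i: v - mean for i, v in ratings.items()}
    return cosine(center(a), center(b))

# Hypothetical users with opposite tastes on the same two titles.
alice = {"LOTR": 5, "GoT": 1}
bob = {"LOTR": 1, "GoT": 5}

print(jaccard(alice, bob))          # identical item sets
print(cosine(alice, bob))           # still positive
print(centered_cosine(alice, bob))  # strongly negative
```

Jaccard sees these two users as identical because it ignores values, plain cosine still reports a positive similarity, and only the centered version captures that their tastes are opposite.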
Item-to-item
User-based vs. item-based
In practice, item-based CF outperforms user-based CF in many cases
Item-based CF pros:
• better when the number of users is large
• better for new users
• no need to recalculate as often as user-based (caching)
• more likely to converge => better accuracy
Matrix Factorization
Approximates the utility matrix as a product of low-rank matrices
Identifies latent features
P x Q
Matrix Factorization algorithm
• Initialize P and Q with small random numbers
• Train P and Q
• Alternating Least Squares
• Stochastic Gradient Descent
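The steps above can be sketched with plain SGD (not ALS) in pure Python. The rating triples, the number of latent features `k`, the learning rate `lr` and the L2 regularization `reg` are all assumed values chosen for illustration, not settings from the talk.

```python
import random
from math import sqrt

def predict(P, Q, u, i):
    """Predicted rating = dot product of user and item latent vectors."""
    return sum(p * q for p, q in zip(P[u], Q[i]))

def factorize(ratings, n_users, n_items, k=2, epochs=2000, lr=0.01, reg=0.02, seed=0):
    """ratings: list of (user_index, item_index, value) for known cells only."""
    rng = random.Random(seed)
    # Initialize P and Q with small random numbers.
    P = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_items)]
    order = list(ratings)  # copy so the caller's list is not reordered
    for _ in range(epochs):
        rng.shuffle(order)
        for u, i, r in order:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                p, q = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * q - reg * p)
                Q[i][f] += lr * (err * p - reg * q)
    return P, Q

def rmse(ratings, P, Q):
    return sqrt(sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings))

# Hypothetical known cells of a 4 x 4 utility matrix.
ratings = [(0, 0, 5), (0, 2, 2), (1, 0, 4), (1, 1, 5), (1, 3, 3),
           (2, 1, 3), (2, 2, 4), (3, 0, 5), (3, 3, 5)]

P0, Q0 = factorize(ratings, 4, 4, epochs=0)   # untrained baseline
P, Q = factorize(ratings, 4, 4)
print(rmse(ratings, P0, Q0), rmse(ratings, P, Q))
print(predict(P, Q, 0, 1))  # estimate for a blank cell
```

Only known cells contribute to the error, which is why the trained factors can be used to fill in the blanks.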
MF example
Latent features are calculated via MF:
Evaluation
Academic metrics:
• RMSE
• MAE
• Precision/Recall
(all may have low correlation with
actual user satisfaction)
Business metrics:
• CTR/CVR
• ROI
• CLV (Customer Lifetime Value)
Customer metrics:
• Coverage – covering more items for recommendations
• Diversity – higher variety of items (counters the rich-get-richer effect)
• Novelty – recommending new items
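For reference, a few of these metrics written out as plain functions. The definitions are the common textbook ones; `precision_recall_at_k` and `catalog_coverage` are illustrative names, and the sample inputs are hypothetical.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean squared error between two equal-length rating lists."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error between two equal-length rating lists."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def precision_recall_at_k(recommended, relevant, k):
    """Precision: share of the top k that is relevant.
    Recall: share of the relevant items that made it into the top k."""
    hits = len(set(recommended[:k]) & set(relevant))
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def catalog_coverage(recommendation_lists, catalog):
    """Share of the catalog that appears in at least one recommendation list."""
    recommended = set()
    for items in recommendation_lists:
        recommended.update(items)
    return len(recommended & set(catalog)) / len(catalog)

print(rmse([5, 3], [4, 5]), mae([5, 3], [4, 5]))
print(precision_recall_at_k(["a", "b", "c", "d"], {"a", "c", "e"}, k=2))
print(catalog_coverage([["a", "b"], ["b", "c"]], ["a", "b", "c", "d"]))
```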
Sparsity problem
As a rule of thumb, CF stops working well once sparsity exceeds roughly 99.5%
• Add product views, shopping cart and other activities
• Decreases sparsity
• Matrix Factorization, SVD
• No zeros
• Content description
• Hybrid content-based + collaborative filtering
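Sparsity here is the share of empty cells in the utility matrix. A quick check, using made-up shop numbers, of how adding product views can move a dataset under the approximate threshold:

```python
def sparsity(n_interactions, n_users, n_items):
    """Share of user-item cells with no recorded interaction."""
    return 1.0 - n_interactions / (n_users * n_items)

# Hypothetical shop: 10,000 users, 2,000 items.
purchases_only = sparsity(50_000, 10_000, 2_000)
# Assumes views add 150,000 user-item pairs not already covered by purchases.
with_views = sparsity(50_000 + 150_000, 10_000, 2_000)

print(f"purchases only: {purchases_only:.2%}")
print(f"with views:     {with_views:.2%}")
```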
Cold start problem
User cold start: new users
• Non-personalized recommendations: most popular, highly rated
• Use user profile (age, gender, etc.) and segment
Item cold start: new items
• Don’t recommend (what about news?)
• Use item content if available
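A minimal sketch of the user cold-start fallback: users with no history get the most popular items, everyone else goes through the personalized model. The `personalized` callable is a hypothetical placeholder for whatever model is in use, and the purchase data is invented.

```python
from collections import Counter

def most_popular(purchases, top_n=3):
    """Non-personalized fallback: items bought by the most users."""
    counts = Counter(item for basket in purchases.values() for item in basket)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

def recommend_with_fallback(user, purchases, personalized, top_n=3):
    """personalized: hypothetical callable(user) -> ranked list of items."""
    if not purchases.get(user):
        return most_popular(purchases, top_n)  # new user: no history yet
    return personalized(user)[:top_n]

# Hypothetical purchase history.
purchases = {
    "alice": {"phone", "case"},
    "bob": {"phone", "case", "charger"},
    "carol": {"phone", "charger"},
    "david": {"laptop"},
}

print(recommend_with_fallback("new_user", purchases, lambda u: [], top_n=2))
```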
Scalability problem
Amazon had 30+ million customers and several million catalog items.
Solution:
• Reduce number of customers by randomly sampling them or discarding
customers with few purchases
• Reduce number of items by discarding very popular or unpopular items
• Dimensionality reduction techniques such as clustering
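The first two reductions can be sketched as a single pruning pass. The thresholds `min_user_items`, `min_item_users` and `max_item_share` are assumed values for illustration; random sampling of customers and clustering are not shown.

```python
from collections import Counter

def prune(purchases, min_user_items=2, min_item_users=2, max_item_share=0.9):
    """Drop customers with few purchases, then very rare and very popular items."""
    users = {u: set(b) for u, b in purchases.items() if len(b) >= min_user_items}
    counts = Counter(item for basket in users.values() for item in basket)
    n_users = len(users)
    keep = {
        item for item, c in counts.items()
        if c >= min_item_users and c / n_users <= max_item_share
    }
    # Note: some remaining users may end up below min_user_items after this step.
    return {u: basket & keep for u, basket in users.items()}

# Hypothetical purchase history.
purchases = {
    "alice": {"phone", "case"},
    "bob": {"phone", "case", "charger"},
    "carol": {"phone", "charger"},
    "david": {"laptop"},
}

print(prune(purchases))
```

In this toy data "david" is dropped for having a single purchase and "phone" is dropped because every remaining user bought it, so it carries no signal for telling users apart.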
Other challenges
• Gray sheep
• Diversity and the long tail (rich-get-richer effect)
• Shilling attacks
• Privacy
• The EU has quite strict rules and a strong culture of data privacy
• Netflix was sued over its dataset publication, which led to the cancellation of a second Netflix
Prize competition in 2010
Implementation questions
• For CF+MF - automatic model updates? how frequently?
• How and where to store MF model?
• Emails - track recommended items and don’t duplicate
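One way to handle the email point, sketched under the assumption that a per-user record of already-sent items is kept somewhere. Here it is an in-memory dict named `already_sent`; in practice it would live in a database.

```python
def pick_for_email(user, candidates, already_sent, max_items=3):
    """Return the best candidates not emailed to this user before, and record them.

    candidates: ranked list of recommended items (best first).
    already_sent: dict user -> set of items, updated in place (hypothetical store).
    """
    sent = already_sent.setdefault(user, set())
    fresh = [item for item in candidates if item not in sent][:max_items]
    sent.update(fresh)
    return fresh

already_sent = {}
ranked = ["a", "b", "c", "d"]
first_email = pick_for_email("alice", ranked, already_sent, max_items=2)
second_email = pick_for_email("alice", ranked, already_sent, max_items=2)
third_email = pick_for_email("alice", ranked, already_sent, max_items=2)
print(first_email, second_email, third_email)
```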
Tools
Language / Stack: Tools / Libraries
R: recommenderlab, recosystem
Python: Scikit-learn, crab, implicit, python-recsys, Surprise, GraphLab Create ($$$)
Java: LensKit, Cofi, Apache Mahout
C++: SVDFeature, Waffles, Graphchi, LIBMF, GraphLab Create ($$$)
C#: Nreco
Node.JS: raccoon
SaaS: Google Cloud Prediction API, Amazon Machine Learning, PredictionIO, SuggestGrid
https://github.com/grahamjenson/list_of_recommender_systems
Materials
• A Gentle Introduction to Recommender Systems with Implicit Feedback
• Matrix Factorization: A Simple Tutorial and Implementation in Python
• Matrix Factorization Model in Collaborative Filtering
• Finding similar music using Matrix Factorization
• Mining of Massive Datasets (Stanford), Chapter 9
• AI Ukraine 2014 - Sergey Nikolenko - Recommender Systems
• Recommender Systems specialization (Coursera)
Thank you!
Skype: alex_konduforov
Email: alex.konduforov@altexsoft.com

Editor's Notes

  • #22: Recall - % of relevant documents that are retrieved