Lessons Learned from
Building real-life Recsys
Xavier Amatriain (Quora)
Deepak Agarwal (LinkedIn)
4. Recommendations
@Quora
Our Mission
“To share and grow the world’s
knowledge”
• Millions of questions & answers
• Millions of users
• Thousands of topics
• ...
Demand
What we care about
Quality
Relevance
Data @ Quora
Lots of high-quality textual information
Lots of data relations
Recommendations
@Quora
Recommendations at Quora
● Homepage feed ranking
● Email digest
● Answer ranking
● Topic recommendation
● User recommendation
● Trending Topics
● Automated Topic Labelling
● Related Question
● ...
click
upvote
downvote
expand
share
Models
● Deep Neural Networks
● Logistic Regression
● Elastic Nets
● Gradient Boosted Decision Trees
● Random Forests
● LambdaMART
● Matrix Factorization
● LDA
● ...
5. Lessons learned
(Part II)
1. Implicit signals beat explicit ones (almost always)
Implicit vs. Explicit
● Many have acknowledged
that implicit feedback is more
useful
● Is implicit feedback really always
more useful?
● If so, why?
● Implicit data is (usually):
○ More dense, and available for all users
○ More representative of user behavior vs.
user reflection
○ More related to final objective function
○ Better correlated with AB test results
● E.g. Rating vs watching
Implicit vs. Explicit
● However
○ It is not always the case that
direct implicit feedback correlates
well with long-term retention
○ E.g. clickbait
● Solution:
○ Combine different forms of
implicit + explicit to better represent
long-term goal
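A minimal sketch of what combining implicit and explicit signals into one training label could look like. The field names, the dwell-time threshold and the weights are all hypothetical and chosen only for illustration; they are not taken from the talk.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    clicked: bool          # implicit signal
    dwell_seconds: float   # implicit signal
    upvoted: bool          # explicit signal
    downvoted: bool        # explicit signal


def blended_label(x: Interaction, min_dwell: float = 10.0) -> float:
    """Blend implicit and explicit feedback into a label in [0, 1].

    The weights below are made-up assumptions, not tuned values.
    """
    score = 0.0
    if x.clicked and x.dwell_seconds >= min_dwell:
        score += 0.5   # click followed by actual reading
    elif x.clicked:
        score += 0.1   # click with a quick bounce: possible clickbait
    if x.upvoted:
        score += 0.5
    if x.downvoted:
        score -= 0.5
    return max(0.0, min(1.0, score))


clickbait = Interaction(clicked=True, dwell_seconds=2.0, upvoted=False, downvoted=False)
engaged = Interaction(clicked=True, dwell_seconds=45.0, upvoted=True, downvoted=False)
print(blended_label(clickbait), blended_label(engaged))
```

A click alone earns little credit here, so a clickbait item that gets clicked and abandoned ends up with a much lower label than one that is read and upvoted.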
Implicit vs. Explicit
2. Be thoughtful about your Training Data
Defining training/testing data
● Training a simple binary classifier for
good/bad answer
○ Defining positive and negative labels ->
Non-trivial task
○ Is this a positive or a negative?
■ funny uninformative answer with many
upvotes
■ short uninformative answer by a well-known
expert in the field
■ very long informative answer that nobody
reads/upvotes
■ informative answer with grammar/spelling
mistakes
■ ...
3. Your Model will learn what you teach it to learn
Training a model
● Model will learn according to:
○ Training data (e.g. implicit and explicit)
○ Target function (e.g. probability of user reading an answer)
○ Metric (e.g. precision vs. recall)
● Example 1 (made up):
○ Optimize probability of a user going to the cinema to
watch a movie and rate it “highly” by using purchase history
and previous ratings. Use NDCG of the ranking as final
metric using only movies rated 4 or higher as positives.
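The metric in Example 1 can be sketched as follows, assuming binary gains (a movie counts as relevant only if rated 4 or higher) and the common log2 position discount. The function name and the cutoff parameter are illustrative choices, not part of the talk.

```python
import math


def ndcg_at_k(ranked_ratings, k, min_positive=4):
    """NDCG@k where an item is a positive only if its rating >= min_positive.

    ranked_ratings: true ratings of the items, in the order the model ranked them.
    """
    gains = [1.0 if r >= min_positive else 0.0 for r in ranked_ratings]

    def dcg(gs):
        # Position i (0-based) is discounted by log2(i + 2).
        return sum(g / math.log2(i + 2) for i, g in enumerate(gs[:k]))

    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


print(ndcg_at_k([5, 4, 2, 1], k=4))  # positives ranked first
print(ndcg_at_k([2, 5], k=2))        # the only positive is ranked second
```

Changing `min_positive` changes what the model is rewarded for, which is the point of the slide: the choice of positives is part of the problem definition.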
Example 2 - Quora’s feed
● Training data = implicit + explicit
● Target function: Value of showing a story to a user ~ weighted sum of actions:
$v = \sum_a v_a \, \mathbb{1}\{y_a = 1\}$
○ predict probabilities for each action, then compute expected value:
$v_{pred} = E[V \mid x] = \sum_a v_a \, p(a \mid x)$
● Metric: any ranking metric
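A small sketch of the expected-value target from Example 2: each action a has a value v_a, a model supplies p(a | x), and stories are ranked by the weighted sum. The action values and probabilities below are invented for illustration; the talk does not give actual weights.

```python
def expected_value(action_probs, action_values):
    """v_pred = sum over actions a of v_a * p(a | x)."""
    return sum(action_values[a] * p for a, p in action_probs.items())


# Hypothetical per-action values (v_a); not Quora's real weights.
ACTION_VALUES = {"click": 1.0, "upvote": 5.0, "share": 10.0, "downvote": -8.0}

# Hypothetical predicted probabilities p(a | x) for two candidate stories.
stories = {
    "story_a": {"click": 0.30, "upvote": 0.02, "share": 0.01, "downvote": 0.05},
    "story_b": {"click": 0.10, "upvote": 0.08, "share": 0.02, "downvote": 0.00},
}

ranked = sorted(
    stories,
    key=lambda s: expected_value(stories[s], ACTION_VALUES),
    reverse=True,
)
print(ranked)
```

In this made-up data, story_a attracts more clicks but also more downvotes, so story_b ranks higher once all actions are weighed together.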
4. Explanations might matter more than the prediction
Explanation/Support for Recommendations
Social Support
5. If You Have to Pick one single approach, Matrix factorization is your best bet
Matrix Factorization
● MF can be interpreted as
○ Unsupervised:
■ Dimensionality Reduction a la PCA
■ Clustering (e.g. NMF)
○ Supervised:
■ Labeled targets ~ regression
● Very useful variations of MF
○ BPR, ALS, SVD++
○ Tensor Factorization, Factorization Machines
● However...
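For readers who have not seen it written out, here is a minimal sketch of the supervised (regression-style) view of MF: user and item factors are fit by stochastic gradient descent on observed ratings with L2 regularization. This is plain SGD factorization, not BPR, ALS or SVD++, and the toy ratings and hyperparameters are assumptions for illustration.

```python
import random


def factorize(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02, epochs=500, seed=0):
    """Fit rating ~ dot(P[u], Q[i]) on (user, item, rating) triples with SGD."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error plus L2 penalty.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


# Toy data: users 0 and 1 prefer item 0, user 2 prefers item 1.
ratings = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.0), (1, 1, 1.0), (2, 0, 1.0), (2, 1, 5.0)]
P, Q = factorize(ratings, n_users=3, n_items=2)
print(round(predict(P, Q, 0, 0), 2), round(predict(P, Q, 0, 1), 2))
```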
6. Everything is an ensemble
Ensembles
● Netflix Prize was won by an ensemble
○ Initially Bellkor was using GBDTs
○ BigChaos introduced ANN-based ensemble
● Most practical applications of ML run an ensemble
○ Why wouldn’t you?
○ At least as good as the best of your methods
○ Can add completely different approaches
(e.g. CF and content-based)
○ You can use many different models at the
ensemble layer: LR, GBDTs, RFs, ANNs...
Ensembles & Feature Engineering
● Ensembles are the way to turn any model into a feature!
● E.g. Don’t know if the way to go is to use Factorization
Machines, Tensor Factorization, or RNNs?
○ Treat each model as a “feature”
○ Feed them into an ensemble
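One way to read "treat each model as a feature" is stacking: the scores of the base models become the input columns of a simple blender. The sketch below uses a hand-written logistic regression as the ensemble layer; the two base-model scores (a CF-style score and a content-based score) and the labels are made-up numbers, and in practice the blender should be trained on held-out predictions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def blend(w, b, x):
    """Ensemble prediction from a row of base-model scores."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def train_blender(rows, labels, lr=1.0, epochs=5000):
    """Logistic regression fit by batch gradient descent on base-model scores."""
    n = len(rows[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * n, 0.0
        for x, y in zip(rows, labels):
            err = blend(w, b, x) - y
            for j in range(n):
                gw[j] += err * x[j]
            gb += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, gw)]
        b -= lr * gb / len(rows)
    return w, b


# Each row: [cf_score, content_score] from two hypothetical base models.
rows = [[0.9, 0.2], [0.8, 0.7], [0.3, 0.9], [0.2, 0.1], [0.3, 0.2], [0.1, 0.4]]
labels = [1, 1, 1, 0, 0, 0]

w, b = train_blender(rows, labels)
print([round(blend(w, b, x), 2) for x in rows])
```

In this toy data neither score separates positives from negatives on its own, but a weighted combination does, which is the argument for adding a new approach as one more column rather than choosing between approaches.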
The Master Algorithm?
It definitely is an ensemble!
7. Building Recommender Systems is also about Feature Engineering
Need for feature engineering
In many cases an understanding of the domain will lead to
optimal results.
Feature Engineering
Feature Engineering Example - Quora Answer Ranking
What is a good Quora answer?
• truthful
• reusable
• provides explanation
• well formatted
• ...
Feature Engineering Example - Quora Answer Ranking
How are those dimensions translated
into features?
• Features that relate to the answer
quality itself
• Interaction features
(upvotes/downvotes, clicks,
comments…)
• User features (e.g. expertise in topic)
Feature Engineering
● Properties of a well-behaved
ML feature:
○ Reusable
○ Transformable
○ Interpretable
○ Reliable
8. Why you should care about answering questions (about your recsys)
Model debuggability
● Value of a model = value it brings to the product
● Product owners/stakeholders have expectations on
the product
● It is important to be able to answer questions about why
something failed
● Bridge gap between product design and ML algos
● Model debuggability is so important it can
determine:
○ Particular model to use
○ Features to rely on
○ Implementation of tools
Model debuggability
● E.g. Why am I seeing or not seeing
this on my homepage feed?
9. Data and Models are great. You know what’s even better? The right evaluation approach!
Offline/Online testing process
Executing A/B tests
● Measure differences in metrics across statistically identical
populations that each experience a different algorithm.
● Decisions on the product are always data-driven
● Overall Evaluation Criteria (OEC) = member retention
○ Use long-term metrics whenever possible
○ Short-term metrics can be informative and allow faster decisions
■ But, not always aligned with OEC
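As an illustration of measuring a difference between two populations, here is a sketch of a two-proportion z-test on a retention-style metric. The test choice and the counts are assumptions for the example; the talk does not say which statistical test is used.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two proportions.

    Returns (z, p_value); z > 0 means group B has the higher rate.
    """
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return z, 2 * (1 - cdf)


# Hypothetical counts: users retained out of users exposed to each algorithm.
z, p_value = two_proportion_z_test(success_a=4000, n_a=10000, success_b=4200, n_b=10000)
print(round(z, 2), round(p_value, 4))
```

A small p-value on a short-term metric still says nothing about whether the change helps the OEC, which is why the slide recommends long-term metrics where possible.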
Offline testing
● Measure model performance,
using (IR) metrics
● Offline performance = indication
to make decisions on follow-up
A/B tests
● A critical (and mostly unsolved)
issue is how offline metrics
correlate with A/B test results.
10. You don’t need to distribute your Recsys
Distributing Recommender Systems
● Most of what people do in practice can fit
into a multi-core machine
○ As long as you use:
■ Smart data sampling
■ Offline schemes
■ Efficient parallel code
● (… but not Deep ANNs)
● Do you care about costs? How about latencies or
system complexity/debuggability?
Matrix Factorization Example
Conclusions
● Recommender Systems are about much more than
just predicting a rating
● Designing a “real-life” recsys means paying
attention to issues such as:
○ Feature engineering
○ Training dataset
○ Metrics
○ Experimentation and AB Testing
○ System scalability
○ ...
● Lots of room for improvement & research
Questions?
Xavier Amatriain (Quora)
xavier@amatriain.net
Deepak Agarwal (LinkedIn)
dagarwal@linkedin.com

Recsys 2016 tutorial: Lessons learned from building real-life recommender systems