Recommendation for new users at Criteo

Olivier Koch, Criteo
RecSys London Meetup - Nov 8th, 2018
Large-scale
recommendation
for new users

2 •
Joint work with Ivan Lobov, Mohamed Amine
Benhalloum, Dmitry Parfenchik, Alexandre Gillotte, Alois
Bissuel, Vincent Grosbois, Sergei Lebedev, Flavian Vasile

3 •
Outline
1. Context
2. Large-scale matrix factorization with randomized SVD
3. Offline evaluation methods
4. What's next?

4 •
What / Who is Criteo again?
Buy ad space on publishers’ websites.
Build banners showing products that users will like / want to buy.
Get paid if users click / buy the product.

5 •
What / Who is Criteo again?
3 billion ads/day
5 billion products
100 ms

6 •
Retargeting
~ a few hours

7 •
Acquisition
?
~ a few days/weeks

8 •
At scale
2B users
20K partners
~1M products/partner
Hundreds of possible campaigns per user
In 50 ms!

9 •
The Acquisition pipeline
Campaign selection
Product selection
(Recommendation)
Bidding

10 •
Campaign selection
Product selection
(Recommendation)
Bidding

11 •
The Recommendation problem
Campaign selection
Product selection
(Recommendation)
Bidding

12 •
Instead of letting a different model do the
bidding/campaign selection, how about we do
recommendation for all user-partner pairs?
200B recommendations, anyone?

14 •
Singular value decomposition
A = U S V^T
A: m x n, U: m x m, S: m x n, V^T: n x n
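A minimal NumPy sketch of the decomposition on this slide, on a small random matrix (the sizes and the random data are assumptions for illustration only). It checks the shapes of U, S and V^T and that their product reproduces A.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 4                      # toy sizes, chosen for illustration
A = rng.standard_normal((m, n))

# full_matrices=True returns U (m x m) and V^T (n x n), as on the slide.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# S is m x n with the singular values on its diagonal.
S = np.zeros((m, n))
S[:len(s), :len(s)] = np.diag(s)

# Largest absolute entry of A - U S V^T; should be at rounding-error level.
max_error = np.abs(A - U @ S @ Vt).max()
```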

15 •
The catch
m = n = hundreds of millions of items
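A back-of-the-envelope sketch of why this is a catch, assuming m = n = 100 million (the low end of the figure on the slide) and 4-byte floats. Both values are assumptions chosen for the arithmetic, not numbers from the talk.

```python
m = n = 100_000_000        # assumed: lower end of the size quoted on the slide
bytes_per_float = 4        # assumed: float32 storage

# A dense m x n matrix (or the m x m factor U) needs m * n entries.
dense_bytes = m * n * bytes_per_float
dense_petabytes = dense_bytes / 1e15
```

Under these assumptions a single dense matrix takes 40 petabytes, so a full SVD with an m x m factor U is out of reach.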

16 •
Randomized SVD
Trick: Approximate A with a tall-and-tiny matrix Q

18 •
Randomized SVD
How do we find Q?
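One common answer is the randomized range finder from Halko, Martinsson and Tropp: multiply A by a random Gaussian matrix and orthonormalize the result. The sketch below is a single-machine NumPy version of that idea, not the Spark-RSVD implementation; the function name and the `oversample` and `power_iters` defaults are assumptions.

```python
import numpy as np

def randomized_svd(A, rank, oversample=10, power_iters=2, seed=0):
    """Approximate rank-`rank` SVD of A via a random range finder."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    k = min(n, rank + oversample)

    # 1. Sample the range of A with a Gaussian test matrix (n x k).
    Y = A @ rng.standard_normal((n, k))

    # 2. Optional power iterations, re-orthonormalizing at each step.
    for _step in range(power_iters):
        Y, _ = np.linalg.qr(Y)
        Z, _ = np.linalg.qr(A.T @ Y)
        Y = A @ Z

    # 3. Q is a tall, thin orthonormal basis (m x k) with A ~ Q Q^T A.
    Q, _ = np.linalg.qr(Y)

    # 4. SVD of the small k x n matrix B = Q^T A, then map back with Q.
    B = Q.T @ A
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small
    return U[:, :rank], s[:rank], Vt[:rank]

# Toy check on a matrix of exact rank 5.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 100))
U, s, Vt = randomized_svd(A, rank=5)
approx_error = np.abs(A - U @ np.diag(s) @ Vt).max()
```

Only products of A with thin matrices are needed, which is what makes the approach attractive for a distributed setting.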

21 •
Randomized SVD
[Figure: plot of singular values]

22 •
Randomized SVD
Nathan Halko, Per-Gunnar Martinsson, Joel A. Tropp, "Finding structure with
randomness: Probabilistic algorithms for constructing approximate matrix
decompositions", SIAM Review, May 2011

23 •
spark-rsvd
https://github.com/criteo/Spark-RSVD

24 •
spark-rsvd (blog post)
https://medium.com/@alois.bissuel/6695b649f519

25 •
Point-wise mutual information
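A minimal sketch of turning a co-occurrence count matrix into a PMI matrix, using the standard definition PMI(i, j) = log(p(i, j) / (p(i) p(j))). The function name, the convention of setting unobserved pairs to 0, and the optional clipping to positive PMI are assumptions, not details from the talk.

```python
import numpy as np

def pmi_matrix(cooc, positive=False):
    """PMI from a square matrix of co-occurrence counts."""
    cooc = np.asarray(cooc, dtype=float)
    total = cooc.sum()
    row = cooc.sum(axis=1, keepdims=True)
    col = cooc.sum(axis=0, keepdims=True)

    # Count expected under independence: p(i) * p(j) * total.
    expected = row @ col / total
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(cooc / expected)

    # Assumed convention: pairs never seen together get a PMI of 0.
    pmi[~np.isfinite(pmi)] = 0.0
    if positive:
        pmi = np.maximum(pmi, 0.0)
    return pmi

# Toy counts for three products.
cooc = [[0, 2, 0],
        [2, 0, 1],
        [0, 1, 0]]
pmi = pmi_matrix(cooc)
```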

26 •
Approximate nearest neighbors with Annoy
https://erikbern.com/2015/10/01/nearest-neighbors-and-vector-models-part-2-how-to-search-in-high-dimensional-spaces.html
Credits: Erik Bernhardsson
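For reference, this is the exact search that an approximate index stands in for: a brute-force cosine top-k over all item vectors, written in NumPy. It is not Annoy's API; the function name and the toy vectors are assumptions. Annoy replaces the linear scan with a forest of random-projection trees so that only a fraction of the items is compared per query.

```python
import numpy as np

def top_k_cosine(query, item_vectors, k):
    """Exact k nearest items by cosine similarity (linear scan)."""
    items = item_vectors / np.linalg.norm(item_vectors, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = items @ q
    top = np.argsort(-scores)[:k]
    return top, scores[top]

# Toy 2-d vectors, for illustration only.
item_vectors = np.array([[1.0, 0.0],
                         [0.0, 1.0],
                         [1.0, 1.0],
                         [-1.0, 0.0]])
query = np.array([1.0, 0.1])
top, scores = top_k_cosine(query, item_vectors, k=2)
```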

27 •
Putting it all together
training: User timelines -> CoEvent matrix -> PMI matrix -> R-SVD -> Product vectors -> KNN Indexing -> KNN Indices
inference: User timelines -> User embedding -> KNN Search -> Recommendations
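A toy sketch of the two ends of this pipeline: counting co-events from user timelines, and embedding a user from product vectors. The slides do not define either step precisely, so both are assumptions here: a co-event is taken to be two products appearing in the same timeline, and a user embedding is taken to be the mean of the vectors of the products in that user's timeline. The function names are hypothetical.

```python
from itertools import combinations

import numpy as np

def coevent_matrix(timelines, n_products):
    """Symmetric counts of product pairs seen in the same timeline (assumed definition)."""
    C = np.zeros((n_products, n_products))
    for timeline in timelines:
        for i, j in combinations(sorted(set(timeline)), 2):
            C[i, j] += 1
            C[j, i] += 1
    return C

def user_embedding(timeline, product_vectors):
    """Mean of the product vectors in the timeline (assumed aggregation)."""
    return product_vectors[list(timeline)].mean(axis=0)

# Three toy timelines over three product ids.
timelines = [[0, 1], [1, 2], [0, 1, 2]]
C = coevent_matrix(timelines, n_products=3)

# Placeholder product vectors; in the pipeline these would come from R-SVD.
product_vectors = np.eye(3)
u = user_embedding([0, 2], product_vectors)
```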

28 •
memcache
Recommendations
HDFS
All users x partners
RecoService
Campaign
selection
users x ~50 partners

29 •
memcache
Recommendations
HDFS
All users x partners
RecoService
Campaign
selection
users x ~50 partners
Simpler
(« no model »)
Evolvable
(reco-based)

30 •
Offline pipeline runs at scale in 5-10 hours with 100 Spark
executors on ~300M timelines
Spark, Scala, Python
Scheduled every day
The best is the enemy of the good (good enough for an AB test)

31 •
Good vs Best trade-off
Not scalable
Not prod-grade
A few weeks
Scalable
Prod-grade
Many months
Scalable
Not-quite-prod-grade
Several months

33 •
Baselines
• Global best-of (per partner)
• Mixture of « sources » (best-of-by-X) merged into a pClick
model

34 •
Offline metrics
Precision @ k over pairs of partners
[Figure: train vs. validation]
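A minimal sketch of precision @ k for a single ranked list. How the metric is aggregated over pairs of partners is not spelled out on the slide, so this covers only the per-list computation; the function name and the toy ids are assumptions.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are in the relevant set."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

# Toy example: ranked recommendations vs. products the user interacted with.
recommended = [10, 20, 30, 40]
relevant = {20, 40, 50}
p_at_2 = precision_at_k(recommended, relevant, k=2)
p_at_4 = precision_at_k(recommended, relevant, k=4)
```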

40 •
Fusing CF and metadata (content2vec)
Deeper representations of users and products (graph
convolutions, recurrent neural nets)
Train at scale with TF

41 •
tf-yarn: train TensorFlow models on YARN in just a few lines of code!
https://github.com/criteo/tf-yarn

42 •
Summary
Acquisition provides new challenges for Recommendation algorithms
MF (via R-SVD) is an attractive approach to try
We built a pipeline leveraging R-SVD and KNN at scale (~300M users, hundreds of
partners) with promising offline results
Qualitative evaluation matters (on top of the quantitative one)
There are many things coming up next!

43 •
Thank you!
o.koch@criteo.com
ailab.criteo.com
