Deep Learning-based Recommendations for
Germany’s Biggest Vehicle Marketplace
Data Science Meetup, Düsseldorf, March 12th, 2018
Florian Wilhelm
2
Special Interests
• Mathematical Modelling
• Recommendation Systems
• Data Science in Production
• Python Data Stack
Dr. Florian Wilhelm
Data Scientist @ inovex
@FlorianWilhelm
FlorianWilhelm
florianwilhelm.info
Marcel Kurovski
Data Scientist
3
Outline
§Introduction
§Use-cases
§Traditional
§Deep Learning
§Scalability & Production
4
MOBILE.DE
German market leader
13.5 million unique users per month
1.6 million vehicles
290 employees
Headquarters: Dreilinden / Friedrichshain, Berlin
Part of eBay Tech
5
IT-project house for digital transformation:
‣ Agile Development & Management
‣ Web · UI/UX · Replatforming · Microservices
‣ Mobile · Apps · Smart Devices · Robotics
‣ Big Data & Business Intelligence Platforms
‣ Data Science · Data Products · Search · Deep Learning
‣ Data Center Automation · DevOps · Cloud · Hosting
‣ Trainings & Coachings
Using technology to inspire our
clients. And ourselves.
inovex offices in
Karlsruhe · Pforzheim · Köln
München · Hamburg · Stuttgart.
www.inovex.de
6
Outline
§Introduction
§Use-cases
§Traditional
§Deep Learning
§Scalability & Production
7
Why Recommendations?
Show width of offering
Inspiration
Engagement
8
Why Recommendations?
User Benefits
- engagement
- inspiration
- relevance
Business Benefits
- high click-through rate
- low exit and bounce rates
9
Mobile.de Conversion Funnel
Stages: Home, Search Result Page (SRP), View, Contact, Buy, plus the Wishlist
10
Recommendations on Home
Home
Recommendations based on
preferences of visiting users as an
alternative entry point.
11
Recommendations on View Item Page
VIP
Recommendations based on the
specific make and model a user is
viewing to present alternatives
12
Recommendations on your Wishlist
Recommendations based on the
specific make and model of a
deleted ad to provide almost
identical recommendations
Recommendations based on the
user's car preferences and the
parking lot items.
13
Information Explosion
14
Annual Data Sphere increases exponentially
International Data Corporation: Data Age 2025 study, April 2017
Information Load → Humans
Human Processing Capacity
15
Information and Choice Overload?
https://www.linkedin.com/pulse/its-information-overload-filter-failure-productivity-industry-zayats/
“It’s not information overload. It’s filter failure.”
- Clay Shirky
https://en.wikipedia.org/wiki/Clay_Shirky
16
§Introduction
§Use-cases
§Traditional
§Deep Learning
§Scalability & Production
17
Collaborative Filtering
favoured or viewed
rated highly
also favoured or viewed
recommend
items similar
18
Item-based Recommendations
[Figure: user-item matrix with watched/rated entries and unwatched ("?") entries; item-item similarity is measured with cosine similarity]
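The item-based approach on this slide can be sketched in a few lines of NumPy. This is only a minimal illustration with a made-up interaction matrix; the function names `item_cosine_similarity` and `recommend` are hypothetical and not taken from the talk.

```python
import numpy as np

# Toy user-item interaction matrix (rows: users, columns: items).
# 1 = viewed/favoured, 0 = no interaction. Values are made up.
interactions = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
], dtype=float)


def item_cosine_similarity(matrix):
    """Cosine similarity between all pairs of item columns."""
    norms = np.linalg.norm(matrix, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for items without interactions
    normalized = matrix / norms
    return normalized.T @ normalized


def recommend(user_row, similarity, top_n=2):
    """Score unseen items by their summed similarity to the user's seen items."""
    scores = similarity @ user_row
    scores[user_row > 0] = -np.inf  # never recommend what was already seen
    return np.argsort(-scores)[:top_n]


similarity = item_cosine_similarity(interactions)
recommended = recommend(interactions[0], similarity)
print(recommended)
```

For the first user, who interacted with items 0 and 1, item 2 is ranked ahead of item 3 because it shares more users with the items already seen.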
19
Item-based Recommendations
[Figure: items on a user's wishlist (marked "P") lead to a recommendation]
20
Sparsity
21
Non-negative Matrix Factorisation (NMF)
M (|U| × |I|) = L (|U| × |LF|) × R (|LF| × |I|)
with latent factors LF1, LF2
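As a rough illustration of the factorisation M = L × R, the sketch below uses the classic multiplicative update rules for NMF on a tiny made-up matrix. It treats zeros as observed values rather than missing ones, which is a simplification, and the function name `nmf` and all parameter values are assumptions made for this example.

```python
import numpy as np


def nmf(M, n_factors=2, n_iter=200, eps=1e-9, seed=0):
    """Factorise a non-negative matrix M into L (users x factors) and R (factors x items)."""
    rng = np.random.default_rng(seed)
    n_users, n_items = M.shape
    L = rng.random((n_users, n_factors))
    R = rng.random((n_factors, n_items))
    for _ in range(n_iter):
        # Multiplicative updates keep all entries non-negative.
        R *= (L.T @ M) / (L.T @ L @ R + eps)
        L *= (M @ R.T) / (L @ R @ R.T + eps)
    return L, R


# Made-up interaction strengths: two groups of users with different tastes.
M = np.array([
    [5, 4, 0, 0],
    [4, 5, 0, 1],
    [0, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

L, R = nmf(M)
reconstruction = L @ R
error = np.linalg.norm(M - reconstruction)
print(np.round(reconstruction, 1), round(float(error), 2))
```

With two latent factors, each factor tends to capture one of the two user groups in this toy matrix.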
22
Summary of Collaborative Filtering
✓ Collective behaviour of users
✓ Standard method (it works, it's reliable, etc.)
✗ Cold start problem: new listings need a
certain number of clicks before they can be recommended.
✗ Sparsity problems: far fewer interactions
than possible user-item pairs.
✗ Content-agnostic
✗ Only "batch-based" learning
23
Looking For: Used Car (100%)
Prefers (Make): BMW (50%), Audi (50%)
Prefers (Model): Audi A3 (25%), Audi A4 (25%),
BMW 318 (50%)
Searching In: lat 52.5206, lon 13.409
Search Radius: 300km
Preferred Price: 20 000€ ± 1500€
Preferred Mileage: 10 000km ± 5000km
User Preferences
Anonymous
Content-based Filtering: User Preferences
24
Content-based Filtering
interacted
<Price: 10K, Category: small>
<Price: 6K, Category: small>
<Price: 90K, Category: sports>
less similar
<Price: 10K, Category: small>
similar
recommend
25
Cold Start
§ http://www.yusp.com/wp-content/uploads/2015/07/cold-start-problem-recommender-systems-1.jpg
26
Posterior probability ∝ Likelihood × Prior probability
Bayes' Theorem – Representing uncertainty
A Bayesian treatment allows handling uncertainty for users with a low or zero event
count (cold start problem).
This allows us to:
§ use an empirical prior ("average" user) for unknown users
§ update the user's profile as new interactions arrive
§ quantify the uncertainty of the user's preferences
https://en.wikipedia.org
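One way to picture this is a Dirichlet-multinomial model over preferred makes. The slides do not state which distributions are used, so the model choice, the pseudo-counts in `prior_counts` and the function name are all assumptions for this sketch.

```python
import numpy as np

MAKES = ["Audi", "BMW", "VW"]

# Empirical prior: pseudo-counts describing the "average" user (made-up values).
prior_counts = np.array([3.0, 3.0, 4.0])


def posterior_preferences(event_counts, prior=prior_counts):
    """Posterior mean of a Dirichlet prior updated with multinomial event counts."""
    posterior = prior + event_counts
    return posterior / posterior.sum()


new_user = posterior_preferences(np.array([0.0, 0.0, 0.0]))       # no events yet
few_events = posterior_preferences(np.array([0.0, 2.0, 0.0]))     # two BMW views
many_events = posterior_preferences(np.array([0.0, 200.0, 0.0]))  # heavy BMW user
print(new_user, few_events, many_events)
```

A user without events simply gets the prior, while the prior's influence fades as events accumulate.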
27
Bayes' Theorem and User Profiles
[Figure: impact of the prior plotted against the number of user events]
Prior + User Interactions → Posterior User Preferences
28
Summary of Content-based Filtering
✓ Works even if there are no other
users
✓ Content-based preferences of
users are based on a weighted vector
of item features
✗ Hard to make recommendations for
new users (cold start problem)
✗ Not applicable to heterogeneous
content types
✗ Low diversity, i.e. more of the same
29
Traditional Hybrid Recommender
Collaborative
Filtering
Hybrid
Recommender
Content
based
Looking For: Used Car (100%)
Prefers (Make): BMW (50%), Audi (50%)
Prefers (Model): Audi A3 (25%), Audi A4 (25%),
BMW 318 (50%)
Searching In: lat 52.5206, lon 13.409
Search Radius: 300km
Preferred Price: 20 000€ ± 1500€
Preferred Mileage: 10 000km ± 5000km
User Profile
Buyer
Last Action: Yesterday
Frequent User
User 12345
Likelihood to buy: 88 %
Elasticsearch Query
✓ based on Elasticsearch (ES) and Mahout
✓ comprehensible and debuggable
✓ robust and reliable concepts
✓ easy to tune for different use-cases
✗ incapable of capturing inherent non-linear
feature dependencies
✗ lots of manual feature engineering
30
Alternative Hybrid Recommender Approach
[Figure: user-item interaction matrix with unknown ("?") entries, extended by feature columns; feature labels: model, color, mileage, age, gender, income]
31
Make/Model Recommender with LightFM
LightFM:
§ Mature and well-documented Python package
§ Optimized and parallelized with Cython
§ Hybrid recommender based on matrix factorisation
§ Supports learning-to-rank objectives (BPR, WARP)
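To give an idea of what a learning-to-rank objective such as BPR does, here is a minimal NumPy sketch of Bayesian Personalized Ranking on plain matrix factorisation. It is not LightFM's implementation or API; the function `train_bpr`, the hyperparameters and the toy data are assumptions for illustration.

```python
import numpy as np


def train_bpr(positives, n_users, n_items, n_factors=8, lr=0.05,
              reg=0.01, n_steps=5000, seed=0):
    """positives: list of (user, item) pairs with observed interactions."""
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(n_users, n_factors))
    V = rng.normal(scale=0.1, size=(n_items, n_factors))
    seen = {}
    for u, i in positives:
        seen.setdefault(u, set()).add(i)
    for _ in range(n_steps):
        u, i = positives[rng.integers(len(positives))]
        j = rng.integers(n_items)
        while j in seen[u]:  # sample an item the user has not interacted with
            j = rng.integers(n_items)
        u_vec, i_vec, j_vec = U[u].copy(), V[i].copy(), V[j].copy()
        x_uij = u_vec @ (i_vec - j_vec)
        g = 1.0 / (1.0 + np.exp(x_uij))  # gradient of ln(sigmoid(x)) with respect to x
        # Gradient ascent on ln(sigmoid(x_uij)) with L2 regularisation.
        U[u] += lr * (g * (i_vec - j_vec) - reg * u_vec)
        V[i] += lr * (g * u_vec - reg * i_vec)
        V[j] += lr * (-g * u_vec - reg * j_vec)
    return U, V


# Toy data: users 0 and 1 interact with items 0/1, users 2 and 3 with items 2/3.
positives = [(0, 0), (0, 1), (1, 0), (2, 2), (2, 3), (3, 3)]
U, V = train_bpr(positives, n_users=4, n_items=4)
scores = U @ V.T
print(np.round(scores, 2))
```

BPR only cares about the order of items per user: an observed item should score higher than an unobserved one, which is what the pairwise update enforces.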
32
§Introduction
§Use-cases
§Traditional
§Deep Learning
§Scalability & Production
33
Deep Learning
Deep Learning Success Stories
“[...] reported a 29% sales increase to $12.83 billion [...]”
Reasons for Deep Learning
§ captures nonlinear relations
§ holistic approach
§ less feature engineering
§ improved quality
Search
Recommendations
34
Find the car that perfectly fits your life
User's Car Preferences
Car Pool + Attributes (make, model, color, price, …)
Flexible (cold-start, uncertainty, real-time, ...)
Interactions of other users (views, favourites, contacts)
35
Recommendation Task
“Which car would the user most likely consider next?”
[Figure: Mobile.de listings pass through a deep "black box" model that outputs a personalized ranking]
36
Deep Learning Network
[Figure: User Net and Item Net each take continuous (cont.) inputs such as mileage and price, which are scaled, and categorical (cat.) inputs such as color, which are encoded; the inputs shown also include "history" and "views". The resulting user embeddings and item embeddings feed the deep component, RankNet.]
Output: Probability that user likes vehicle
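A forward pass through such an architecture might look roughly like the following NumPy sketch. Layer sizes, activations and the random weights are assumptions chosen for illustration; the talk's actual network is not specified at this level of detail.

```python
import numpy as np

rng = np.random.default_rng(0)


def dense(n_in, n_out):
    """Randomly initialised weights and bias of one fully connected layer."""
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


N_USER_FEATURES, N_ITEM_FEATURES, EMBEDDING_DIM = 6, 6, 4  # illustrative sizes

user_net = dense(N_USER_FEATURES, EMBEDDING_DIM)
item_net = dense(N_ITEM_FEATURES, EMBEDDING_DIM)
rank_hidden = dense(2 * EMBEDDING_DIM, 8)
rank_out = dense(8, 1)


def embed(features, layer):
    """Map a feature vector to an embedding with a single dense layer."""
    weights, bias = layer
    return np.tanh(features @ weights + bias)


def predict(user_features, item_features):
    """Probability that the user likes the vehicle (untrained, random weights)."""
    e_u = embed(user_features, user_net)
    e_i = embed(item_features, item_net)
    hidden_w, hidden_b = rank_hidden
    out_w, out_b = rank_out
    hidden = relu(np.concatenate([e_u, e_i], axis=-1) @ hidden_w + hidden_b)
    return sigmoid(hidden @ out_w + out_b).squeeze(-1)


# Illustrative inputs: scaled continuous values followed by encoded categorical values.
user_features = np.array([[0.38, 0.25, 0.79, 0.0, 1.0, 0.0]])
item_features = np.array([[0.35, -0.15, 2.03, 0.0, 0.2, 0.8]])
probability = predict(user_features, item_features)
print(probability)
```

Keeping the user and item towers separate means item embeddings can be computed ahead of time, independently of any particular user.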
37
Approach: Classifier Training
[Figure: UserNet maps user u to the embedding e_u and ItemNet maps item i to the embedding e_i; RankNet outputs p(i | u). Two losses, class_loss and sim_loss, are each minimized with an Adam optimizer.]
Adam Optimizer: Stochastic Gradient Descent with adaptive learning rate and adaptive momentum
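The Adam update rule itself is compact enough to sketch. The example below minimizes a simple quadratic instead of a network loss; the function name `adam_minimize` and the step count are assumptions, while the default values for beta1, beta2 and eps follow the commonly used settings.

```python
import numpy as np


def adam_minimize(grad_fn, theta, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, n_steps=500):
    """Minimise a function given its gradient, using the Adam update rule."""
    m = np.zeros_like(theta)  # running mean of gradients (momentum)
    v = np.zeros_like(theta)  # running mean of squared gradients (step-size scaling)
    for t in range(1, n_steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g**2
        m_hat = m / (1 - beta1**t)  # bias correction for the zero initialisation
        v_hat = v / (1 - beta2**t)
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta


# Toy objective: squared distance to a target point, with gradient 2 * (theta - target).
target = np.array([1.0, -2.0])
theta = adam_minimize(lambda th: 2 * (th - target), np.zeros(2))
print(theta)
```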
38
Approach: Cost Functions
[Equations (1) and (2): cost functions, including sim_loss]
39
Idea behind Embedding Similarity Regularization
[Figure: inputs u and i (axes x1, x2, x3) are mapped to embeddings e_u and e_i (axes x1, x2); an angle α is marked in the embedding space]
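The exact cost functions are not recoverable from the slide text, so the following is only one plausible form, not the authors' definition: a binary cross-entropy as class_loss and a cosine-based sim_loss that favours a small angle between e_u and e_i for positive pairs and a large one for negative pairs.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-12):
    """Row-wise cosine similarity between two batches of embeddings."""
    return np.sum(a * b, axis=-1) / (
        np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + eps)


def class_loss(p, y, eps=1e-12):
    """Binary cross-entropy between predicted probabilities p and labels y."""
    p = np.clip(p, eps, 1 - eps)
    return float(np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p))))


def sim_loss(e_u, e_i, y):
    """Assumed regulariser: small angle for positives, large angle for negatives."""
    cos = cosine_similarity(e_u, e_i)
    return float(np.mean(y * (1 - cos) + (1 - y) * np.maximum(cos, 0.0)))


e_u = np.array([[1.0, 0.0], [1.0, 0.0]])
e_i = np.array([[1.0, 0.0], [0.0, 1.0]])
aligned = sim_loss(e_u, e_i, np.array([1.0, 0.0]))     # positive pair close, negative pair orthogonal
misaligned = sim_loss(e_u, e_i, np.array([0.0, 1.0]))  # same embeddings, labels swapped
print(aligned, misaligned)
```

A regulariser of this kind would push the embedding space itself to reflect user-item affinity, which is what makes nearest-neighbour search on embeddings meaningful.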
40
Training Data
Time-based Train/Test-Split
CW 14 (April '17) | CW 15 | CW 16 | CW 17 | CW 18 (May '17), divided into Train and Test
§ Weekly User Preferences
§ Lookback window 30d
§ Min. 2 events
§ Result: 85/15 split
Users & Items
§ Subscribed Users
§ Sample Size = 100k
Sub User Events
§ View
§ Parking
§ Contact
Continuous (6) and Categorical (15) Features
Continuous: Consumption, First Registration, Mileage, Price, Latitude, Longitude
Categorical: Airbag, Category, Climatisation, Color, Condition, Country, Doors, Fuel, Make-ID, Model-ID, Previous Owners, Seats, Site-ID, Subcategory, Transmission
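A time-based train/test split as described on this slide can be sketched with plain Python. The split date (start of calendar week 18 in 2017) and the event tuples are assumptions for illustration; the slides do not state exactly where the boundary lies.

```python
from datetime import date


def time_based_split(events, split_date):
    """events: list of (user_id, item_id, event_date) tuples."""
    train = [e for e in events if e[2] < split_date]
    test = [e for e in events if e[2] >= split_date]
    return train, test


# Made-up events spread over April and May 2017.
events = [
    ("u1", "car_a", date(2017, 4, 4)),
    ("u1", "car_b", date(2017, 4, 12)),
    ("u2", "car_a", date(2017, 4, 20)),
    ("u2", "car_c", date(2017, 5, 2)),
]
# Assumed boundary: Monday of calendar week 18, 2017.
train, test = time_based_split(events, split_date=date(2017, 5, 1))
print(len(train), len(test))
```

Splitting by time rather than at random avoids training on interactions that happened after the ones being predicted.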
41
Data Sparsity
MovieLens 1M: 4.2600%
MovieLens 20M: 0.5300%
Last.fm: 0.2800%
Vehicles All: 0.0046%
~8M interactions between 100k users and 1.7M items
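The percentages are the share of observed user-item pairs among all possible pairs. Using the rounded figures from this slide, a quick check gives roughly 0.0047%, in line with the 0.0046% reported (the interaction count is only given approximately). The function name is hypothetical.

```python
def density_percent(n_interactions, n_users, n_items):
    """Share of observed user-item pairs among all possible pairs, in percent."""
    return 100.0 * n_interactions / (n_users * n_items)


vehicles = density_percent(8_000_000, 100_000, 1_700_000)
print(f"{vehicles:.4f}%")
```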
42
Preprocessing
Technical Preprocessing
§ Data fetch
§ Type conversions
§ Contiguous user and item IDs
§ User Profile Generation
§ Reduction to sample size m = 100,000
Content-related Preprocessing
§ Negative sampling within the same vehicle subcategory
§ Binary target labels: {0, 1}
§ Class ratio: 1:1
§ Outlier removal based on continuous features
§ User profile feature conversion
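Negative sampling within the same subcategory at a 1:1 class ratio could be implemented along the following lines. The function `sample_negatives`, the item names and the subcategories are hypothetical; only the idea (one unseen item from the same subcategory per positive) comes from the slide.

```python
import random


def sample_negatives(positives, item_subcategory, seed=0):
    """positives: list of (user, item). Returns (user, item, label) triples."""
    rng = random.Random(seed)
    by_subcategory = {}
    for item, sub in item_subcategory.items():
        by_subcategory.setdefault(sub, []).append(item)
    seen = {}
    for user, item in positives:
        seen.setdefault(user, set()).add(item)
    samples = []
    for user, item in positives:
        pool = [c for c in by_subcategory[item_subcategory[item]]
                if c not in seen[user]]
        if not pool:
            continue  # no unseen item in this subcategory: skip to keep the 1:1 ratio
        samples.append((user, item, 1))
        samples.append((user, rng.choice(pool), 0))
    return samples


item_subcategory = {"a": "suv", "b": "suv", "c": "suv", "d": "van", "e": "van"}
positives = [("u1", "a"), ("u2", "d")]
samples = sample_negatives(positives, item_subcategory)
print(samples)
```

Drawing negatives from the same subcategory makes the classification task harder and more realistic than sampling from the whole inventory.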
43
Improvements by Deep Learning
[Figure: MAP@k for k = 1, 5, 10, 30, 100 (y-axis from 0.25% to 1.15%), comparing Collaborative Filtering, Traditional Hybrid and Deep Recommender; annotated improvements: +73% and +143%]
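For reference, MAP@k (mean average precision at k) can be computed as sketched below. Several normalisation conventions exist; this version divides by min(number of relevant items, k) and assumes each recommendation list has no duplicates. Whether the talk used the same convention is not stated.

```python
def average_precision_at_k(recommended, relevant, k):
    """Average precision of the top-k recommendations for one user."""
    hits, score = 0, 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at this rank
    denom = min(len(relevant), k)
    return score / denom if denom else 0.0


def map_at_k(recommendations, relevants, k):
    """Mean of the per-user average precision values."""
    aps = [average_precision_at_k(rec, rel, k)
           for rec, rel in zip(recommendations, relevants)]
    return sum(aps) / len(aps)


recommendations = [["a", "b", "c"], ["x", "y", "z"]]
relevants = [{"a", "c"}, {"z"}]
result = map_at_k(recommendations, relevants, k=3)
print(result)
```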
44
§Introduction
§Use-cases
§Traditional
§Deep Learning
§Scalability & Production
45
Information Retrieval Dichotomy
Covington, Paul, Jay Adams, and Emre Sargin: Deep Neural Networks for YouTube Recommendations (2016)
46
ANN for Candidate Generation
https://erikbern.com/2016/06/02/approximate-nearest-news.html
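To illustrate the idea of approximate nearest neighbour (ANN) search for candidate generation, the sketch below hashes embeddings with random hyperplanes and only compares a query against items in the same bucket. This is a simplified stand-in, not Annoy's API or its exact algorithm; the class `HyperplaneIndex` and the random embeddings are assumptions.

```python
import numpy as np


class HyperplaneIndex:
    """Buckets vectors by the side of each random hyperplane they fall on."""

    def __init__(self, vectors, n_planes=4, seed=0):
        rng = np.random.default_rng(seed)
        self.vectors = vectors
        self.planes = rng.normal(size=(n_planes, vectors.shape[1]))
        self.buckets = {}
        for idx, vec in enumerate(vectors):
            self.buckets.setdefault(self._key(vec), []).append(idx)

    def _key(self, vec):
        return tuple((self.planes @ vec > 0).astype(int))

    def query(self, vec, top_n=3):
        """Rank only the items in the query's bucket by cosine similarity."""
        candidates = self.buckets.get(self._key(vec), [])
        if not candidates:
            return []
        cand_vecs = self.vectors[candidates]
        sims = cand_vecs @ vec / (
            np.linalg.norm(cand_vecs, axis=1) * np.linalg.norm(vec))
        order = np.argsort(-sims)[:top_n]
        return [candidates[i] for i in order]


rng = np.random.default_rng(1)
item_embeddings = rng.normal(size=(1000, 16))  # made-up item embeddings
index = HyperplaneIndex(item_embeddings)
neighbours = index.query(item_embeddings[42])
print(neighbours)
```

The trade-off is typical for ANN methods: only a fraction of the items is scored per query, at the cost of occasionally missing a true neighbour that landed in another bucket.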
47
Deep Learning Recommender - Architecture
ad storage
embeddings
RankNet
UserNet
ItemNet
Candidate
Generation
ANN Index
Candidate Service
Ranking Service
Web Service
User Preference API
Recommendation Service
48
Technology Stack
Annoy ANN by Spotify
LightFM by Lyst
Hardware: GPU-Server
NVIDIA Tesla K80
4x Intel Xeon 3.5 GHz
64GB RAM, 850GB Disk
49
Production: TensorFlow Serving and Integration
https://www.tensorflow.org/serving/architecture_overview
50
Conclusion
Deep
Learning
Flexibility
Hybrid
Abstraction
51
Recommendations help
users find the perfect vehicle
based on their preferences and
similarities to other users
WE ARE HIRING!
52
53
Thank you!
Florian Wilhelm
Data Scientist
inovex GmbH
Schanzenstraße 6-20
Kupferhütte 1.13
51063 Köln
florian.wilhelm@inovex.de
