DATA PRODUCTS: 5 DEADLY SINS AND HOW TO PREVENT THEM
Pride Wrath Lust Gluttony Sloth
Mathieu Bastian

Web Summit 2015, Dublin
Credits: The Seven Deadly Sins, Nanatsu no Taizai & nimbus-mage.deviantart.com
ABOUT ME
• Data scientist & engineer
• Led data products team at LinkedIn
• Gephi co-founder
• Open-source contributor
DATA PRODUCTS
Source: http://bit.ly/1kMUPAe
Tentative definition
User-facing production system based on an automated learning algorithm
DATA PRODUCTS
TODAY
PRIDE
"Excessive belief in one’s own abilities or
excessive love of oneself"
PRIDE
Source: http://www.themeasurementstandard.com/wp-content/uploads/2015/06/data-scientist-as-superman.jpg
With power comes responsibility
Source: http://www.economist.com/node/15579717
Who are you building it for?
Understand user intent
Integrate into the user flow
Explain recommendations to the user
Set the right user expectations
Treat users the way you would like to be treated
Credits: Google
Anticipate edge cases
WRATH
"Choice of violent and hateful actions
over love and patience"
Exercise perseverance
Chart: reward over time, in three phases: Phase I (Inception), Phase II (Growth), Phase III (Maintenance)
But have a plan
LUST
"Depraved thought, unwholesome
morality and desire for excitement"
LUST
Credits: Google Data Center
Perform due diligence
Thank the janitor & handyman
GLUTTONY
"The consumption of more of anything
than you need"
Avoid solo data scientists
Credits: Lucasfilm
Choose the right problem
M - Measurable
E - Explainable
R - Rapid prototyping
C - Core
I - Iterable
SLOTH
"Not caring about others or living life in a
fulfilling way"
Embrace continuous data pipelines
Source: http://azkaban.github.io/
Make data pipelines robust
Diagram: Code → Upload → Run workflow → Look at logs
Diagram: Code → Upload → Run workflow, with PigUnit
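To illustrate the idea behind the PigUnit slide, here is a minimal sketch of testing a pipeline step locally instead of uploading it and reading logs. It is not PigUnit itself: it uses Python's unittest as a stand-in, and the `top_skills` step, its input format, and the sample rows are all hypothetical.

```python
import unittest
from collections import Counter


def top_skills(rows, k=2):
    """Hypothetical pipeline step: the k most frequent skills per member.

    rows is an iterable of (member_id, skill) pairs; skill may be None or blank.
    """
    counts = {}
    for member_id, skill in rows:
        if skill is None or not skill.strip():
            continue  # edge case: drop missing or blank skills instead of failing
        counts.setdefault(member_id, Counter())[skill.strip().lower()] += 1
    # Sort by descending count, then alphabetically, so the output is deterministic.
    return {
        member: [s for s, _ in sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for member, c in counts.items()
    }


class TopSkillsTest(unittest.TestCase):
    """Runs on a laptop in milliseconds, before any workflow is uploaded."""

    def test_ranks_by_frequency(self):
        rows = [(1, "Java"), (1, "java"), (1, "Pig"), (1, "SQL"), (1, "pig")]
        self.assertEqual(top_skills(rows), {1: ["java", "pig"]})

    def test_skips_missing_values(self):
        rows = [(2, None), (2, "  "), (2, "Hadoop")]
        self.assertEqual(top_skills(rows), {2: ["hadoop"]})

    def test_empty_input(self):
        self.assertEqual(top_skills([]), {})
```

Saved as a module, the tests can be run with `python -m unittest`. The point is the shorter feedback loop: a failing edge case shows up in the test run rather than in the logs of a scheduled workflow.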
THANK YOU!
Mathieu Bastian
@mathieubastian
www.linkedin.com/in/mathieubastian
