Building Personalized
Data Products with Dato
Trey Causey
trey@dato.com
Questions?
• Now: We are monitoring the chat window
• Later: Email me at trey@dato.com
• dato.com
What are data products?
• Products that produce and consume data.
• Products that improve as they produce and
consume data.
• Products that use data to provide a personalized
experience.
• Personalized experiences increase engagement
and retention.
What data?
• You probably already have this data
• Usage logs, transaction data, etc.
• Need a way to turn this existing data into
an intelligent application
Recommender systems
• Personalized experiences through
recommendations
• Recommend products, social network
connections, events, songs, and more
• Implicitly and explicitly drive many of the
experiences you’re familiar with
Recommender uses
• Netflix, Spotify, LinkedIn, and Facebook are the most
visible examples
• “You May Also Like”, “People You May Know”,
“People to Follow”
• Also silently power many other experiences
• Product listings, up-sell options, add-ons
• Netflix Prize: $1MM for a 10% improvement in rating prediction
What data do you need?
• Required for implicit data
• User identifier
• Product identifier
• That’s it!
• Further customization
• Ratings (explicit data), counts
• Side data
Implicit data
• User x product
interactions
• Consumed / used /
clicked / etc.
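
As a minimal sketch of what implicit data looks like in practice, the snippet below turns a usage log of (user identifier, product identifier) rows into per-user interaction counts. The log contents and the `build_interactions` helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical usage log: one (user, product) row per click, purchase, or play.
# No ratings are needed for implicit data.
usage_log = [
    ("alice", "song_a"),
    ("alice", "song_b"),
    ("bob", "song_b"),
    ("bob", "song_c"),
    ("alice", "song_a"),  # repeat interactions are common in logs
]


def build_interactions(log):
    """Map each user to a count of interactions per product."""
    counts = defaultdict(lambda: defaultdict(int))
    for user, item in log:
        counts[user][item] += 1
    # Convert to plain dicts so the result is easy to inspect and compare.
    return {user: dict(items) for user, items in counts.items()}


interactions = build_interactions(usage_log)
print(interactions)
```

The counts can be used as-is, or collapsed to a simple "interacted or not" flag, depending on the model.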
How do recommenders work?
• Most basic: item similarity
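
One way to sketch item similarity on implicit data is shown below. It assumes Jaccard similarity between the sets of users who touched each item; the slides do not name a specific metric, so that choice, the sample pairs, and the helper names are all illustrative.

```python
from collections import defaultdict


def item_users(pairs):
    """Invert (user, item) pairs into item -> set of users."""
    users_by_item = defaultdict(set)
    for user, item in pairs:
        users_by_item[item].add(user)
    return users_by_item


def jaccard(a, b):
    """Jaccard similarity between two sets of users."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, pairs, k=3):
    """Score each unseen item by its summed similarity to the user's items."""
    users_by_item = item_users(pairs)
    seen = {item for u, item in pairs if u == user}
    scores = {}
    for candidate, candidate_users in users_by_item.items():
        if candidate in seen:
            continue
        scores[candidate] = sum(
            jaccard(candidate_users, users_by_item[s]) for s in seen
        )
    # Highest score first; ties broken by name for a deterministic order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


# Hypothetical implicit data: (user, item) pairs.
pairs = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "B"), ("u3", "C"),
    ("u4", "D"),
]
print(recommend("u1", pairs))
```

Item "C" shares users with the items "u1" already has, so it ranks ahead of "D", which shares none.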
Matrix factorization
• Treat users and products as a giant matrix
with (very) many missing values
• Users have latent factors that describe
how much they like various genres
• Items have latent factors that describe
how strongly they belong to each genre
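
To make the latent-factor idea concrete, here is a small sketch with hand-written factors over two made-up "genres". In a real model these numbers are learned rather than set by hand, and the factors need not correspond to human-readable genres; the names and values below are hypothetical.

```python
# Hypothetical latent factors over two made-up genres: (action, romance).
user_factors = {
    "alice": [0.9, 0.1],  # mostly likes action
    "bob": [0.2, 0.8],    # mostly likes romance
}
item_factors = {
    "action_movie": [1.0, 0.0],
    "romance_movie": [0.0, 1.0],
}


def predict(user, item):
    """Predicted affinity is the dot product of user and item factors."""
    return sum(u * v for u, v in zip(user_factors[user], item_factors[item]))


for user in user_factors:
    for item in item_factors:
        print(user, item, predict(user, item))
```

A user whose factors line up with an item's factors gets a high predicted score for that item.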
Matrix factorization
• Turn this into a fill-in-the-missing-value
exercise by learning the latent factors
• Implicit or explicit data
• Part of the winning formula for the Netflix
Prize
• Predict ratings or rankings
Fill in the blanks
• Learn the latent factors that minimize
prediction error on the observed values
• Fill in the missing values
• Sort the list by predicted rating &
recommend the unseen items
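
The three steps above can be sketched in plain Python. This is a toy version that assumes stochastic gradient descent with L2 regularization and no bias terms; it is not the GraphLab Create implementation, and the ratings, hyperparameter values, and function names are hypothetical.

```python
import math
import random


def dot(p, q):
    return sum(a * b for a, b in zip(p, q))


def factorize(ratings, n_factors=2, n_epochs=500, lr=0.02, reg=0.02, seed=0):
    """Learn user and item factors by SGD on the observed ratings only."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    P = {u: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(n_epochs):
        for u, i, r in ratings:
            err = r - dot(P[u], Q[i])
            for f in range(n_factors):
                p, q = P[u][f], Q[i][f]
                P[u][f] += lr * (err * q - reg * p)
                Q[i][f] += lr * (err * p - reg * q)
    return P, Q


def rmse(ratings, P, Q):
    """Root mean squared error on a list of (user, item, rating) triples."""
    squared = [(r - dot(P[u], Q[i])) ** 2 for u, i, r in ratings]
    return math.sqrt(sum(squared) / len(squared))


def recommend(user, ratings, P, Q, k=2):
    """Fill in missing values for one user and return the top unseen items."""
    seen = {i for u, i, _ in ratings if u == user}
    scores = {i: dot(P[user], Q[i]) for i in Q if i not in seen}
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]


# Hypothetical explicit ratings: (user, item, rating on a 1-5 scale).
ratings = [
    ("alice", "A", 5), ("alice", "B", 4), ("alice", "C", 1),
    ("bob", "A", 4), ("bob", "B", 5), ("bob", "D", 1),
    ("carol", "C", 5), ("carol", "D", 4), ("carol", "A", 1),
    ("dave", "C", 4), ("dave", "D", 5), ("dave", "B", 1),
]

P, Q = factorize(ratings)
print("training RMSE:", rmse(ratings, P, Q))
print("for alice:", recommend("alice", ratings, P, Q))
```

Only observed ratings contribute to the error; the missing cells are never treated as zeros, which is what makes this a fill-in-the-blanks exercise.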
Rankings?
• Often less concerned with predicting
precise scores
• Just want to get the first few items right
• Screen real estate is precious
• Ranking factorization recommender
Side features
• Include information about users
• Geographic, demographic, time of day,
etc.
• Include information about products
• Product subtypes, geographic
availability, etc.
• Help with the cold start problem
How to choose which model?
• Select the appropriate model for your data
(implicit/explicit), decide whether you want side
features, select hyperparameters, tune
them…
• … or let GraphLab Create do it for you and
automatically tune hyperparameters
Evaluation
• Train on a portion of your data
• Test on a held-out portion
• Ratings: RMSE
• Ranking: Precision, recall
• Business metrics
• Evaluate against popularity
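
The metrics above can be sketched as follows, assuming precision and recall are computed on the top k recommendations and that the popularity baseline simply recommends the most-interacted items a user has not seen. All function names and sample data are hypothetical.

```python
import math
from collections import Counter


def rmse(predicted, actual):
    """Root mean squared error between predicted and held-out ratings."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def precision_recall_at_k(recommended, relevant, k):
    """Precision and recall of the top-k list against held-out items."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def popularity_baseline(train_pairs, user, k):
    """Recommend the k most-interacted items the user has not seen."""
    seen = {i for u, i in train_pairs if u == user}
    counts = Counter(i for _, i in train_pairs)
    ranked = sorted(counts, key=lambda i: (-counts[i], i))
    return [i for i in ranked if i not in seen][:k]


# Hypothetical training interactions and held-out items for one user.
train = [
    ("u1", "A"),
    ("u2", "A"), ("u2", "B"),
    ("u3", "A"), ("u3", "B"), ("u3", "C"),
]
held_out = {"B"}

baseline = popularity_baseline(train, "u1", k=2)
print(baseline, precision_recall_at_k(baseline, held_out, k=2))
```

A model is worth deploying only if it beats this kind of baseline on the held-out data and on the business metrics that matter.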
Live demo
• Building and deploying a recommender
system with GraphLab Create and Dato
Predictive Services
Thank you!
• dato.com
• @datoinc
• trey@dato.com

