Tague Griffith, Redis Labs
Deploying Real-Time Decision
Services using Redis
#MLSAIS12
Why Machine Learning
Teaching a computer, by example, an
algorithm that is too complex to program
Machine Learning Problems
Classification: Pick One of a Set
• Spam detection
• Manufacturing defect detection
• Handwriting analysis
• Algorithms: Decision Trees, Naïve Bayes, Logistic Regression
Regression: Score or Rank
• Recommendations
• Likelihood of purchase
• Algorithms: Linear Regression, SVM
Clustering: Group Similar
• Find similar items
• Customer segmentation
• Cohort detection
• Algorithms: K-Means, K-Nearest Neighbors, Hierarchical Clustering
Supervised Learning – Training Spam Classifier
Deploying a Spam Classifier
How Do We Build These Boxes?
¯\_(ツ)_/¯
• Building high-performance, reliable services is hard. Isn't there
something we can just deploy?
Redis-ML
Typical Spark Application Structure
• Spark Training: data is loaded into Spark
• File System: model is saved in files
• Custom Server: model is loaded into your custom app
• Serving Client: client app
Redis-ML: Predictive Model Serving Engine
• Predictive models as native Redis types
• Perform evaluation directly in Redis
• Store training output as “hot model”
Any Training Platform (e.g. Spark Training: data loaded into Spark) → Redis-ML (model is saved in Redis-ML) → Serving Clients (multiple client apps)
REmote DIctionary Server
Strings, Hashes, Lists, Sets, Sorted Sets, Bitmaps, Bitfields, HyperLogLogs, Geospatial
A Quick Recap of Redis
Key → value, where the value is one of:
• Strings / Bitmaps / BitFields: "I'm a Plain Text String!"
• Hash Tables (objects!): { A: “foo”, B: “bar”, C: “baz” }
• Linked Lists: [ A → B → C → D → E ]
• Sets: { A , B , C , D , E }
• Sorted Sets: { A: 0.1, B: 0.3, C: 100, D: 1337 }
• Geo Sets: { A: (51.5, 0.12), B: (32.1, 34.7) }
• HyperLogLog: 00110101 11001110 10101010
Redis Modules
• Any C/C++ program can now run on Redis
• Use existing or add new data-structures
• Enjoy simplicity, infinite scalability and high availability while
keeping the native speed of Redis
• Can be created by anyone
New Capabilities
New Commands
New Data Types
Redis ML Module
A Redis module that provides:
• Tree Ensembles
• Linear Regression
• Logistic Regression
• Matrix + Vector Operations
• More to come...
Random Forest Model
• A collection of decision trees
• Supports classification & regression
• Splitter Node can be:
◦ Categorical (e.g. day == “Sunday”)
◦ Numerical (e.g. age < 43)
• Decision is taken by majority vote of the decision trees
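The splitter nodes and the majority decision described on this slide can be sketched in a few lines of Python. This is only an illustration and is not how Redis-ML is implemented: the `Leaf`, `Split`, `evaluate` and `classify` names, the three toy trees, and the "yes"/"no" labels are all hypothetical. The two split conditions reuse the slide's own examples (day == "Sunday", age < 43).

```python
from collections import Counter
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    prediction: str  # value returned when evaluation reaches this leaf


@dataclass
class Split:
    attr: str                 # feature name to test
    value: Union[str, float]  # value to compare against
    kind: str                 # "categorical" (==) or "numerical" (<)
    if_true: "Node"
    if_false: "Node"


Node = Union[Leaf, Split]


def evaluate(node, features):
    """Walk one tree from the root down to a leaf for a feature dict."""
    while isinstance(node, Split):
        x = features[node.attr]
        if node.kind == "categorical":
            passed = x == node.value
        else:
            passed = x < node.value
        node = node.if_true if passed else node.if_false
    return node.prediction


def classify(forest, features):
    """Return the prediction that the majority of trees vote for."""
    votes = Counter(evaluate(tree, features) for tree in forest)
    return votes.most_common(1)[0][0]


# Hypothetical toy forest; the labels have no meaning beyond the demo.
tree_a = Split("day", "Sunday", "categorical", Leaf("yes"), Leaf("no"))
tree_b = Split("age", 43, "numerical", Leaf("yes"), Leaf("no"))
tree_c = Split(
    "age", 30, "numerical",
    Leaf("yes"),
    Split("day", "Sunday", "categorical", Leaf("yes"), Leaf("no")),
)
forest = [tree_a, tree_b, tree_c]

print(classify(forest, {"day": "Sunday", "age": 50}))  # two of three trees vote "yes"
```

For regression, a forest would typically average the leaf values instead of counting votes; that variant is left out here.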
Classic Tree Problem: Titanic Survival
[Diagram: decision tree with YES/NO branches; splitter nodes Sex = Male?, Age < 9.5?, Sibps > 2.5?; leaves Survived / Died]
• Passenger data encoded as a feature vector
• ML algorithm learns the tree rules
• ID3, CART (RPART), etc.
• Tree rules used to infer results
Titanic Survival: Random Forest
[Diagram: three decision trees with YES/NO branches and Survived / Died leaves]
• Tree #1: Sex = Male?, Age < 9.5?, Sibps > 2.5?
• Tree #2: Country = US?, State = CA?, Height > 1.60m?
• Tree #3: Weight < 80kg?, I.Q < 100?, Eye color = blue?
Who Would Survive the Titanic
John:
• Male, 34
• Married w/ 2 kids (Sibps=3)
• New York, USA
• 1.78m, 78kg
• 110 IQ
• Blue eyes
Mathew:
• Male, 6
• 3 sisters (Sibps=3)
• New York, USA
• 1.06m, 22.7kg
• 100 IQ
• Brown eyes
Let's use our forest to find out
Redis: Forest Data Type
Add nodes to a tree in a forest:
ML.FOREST.ADD <forestId> <treeId> <path>
[ [NUMERIC|CATEGORIC] <splitterAttr> <splitterVal> ] |
[LEAF] <predVal>
Perform classification/regression of a feature vector:
ML.FOREST.RUN <forestId> <features>
[CLASSIFICATION|REGRESSION]
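As a rough sketch of how a client might assemble these two commands, the Python helpers below build the argument lists without talking to a server. Several details are assumptions that the slide does not state: that `<path>` is written as "." for the root with ".l" / ".r" appended for children, that several nodes may be sent in one ML.FOREST.ADD call, and that `<features>` is a comma-separated "name:value" string. The helper names, the "titanic" forest id, and the one-split tree are hypothetical.

```python
def forest_add_args(forest_id, tree_id, nodes):
    """Build the argument list for ML.FOREST.ADD.

    nodes is a list of (path, spec) pairs, where spec is either
    ("NUMERIC" | "CATEGORIC", splitter_attr, splitter_val) or ("LEAF", pred_val).
    The path notation itself is an assumption (see the text above).
    """
    args = ["ML.FOREST.ADD", forest_id, str(tree_id)]
    for path, spec in nodes:
        kind = spec[0]
        if kind in ("NUMERIC", "CATEGORIC"):
            _, attr, value = spec
            args += [path, kind, attr, str(value)]
        elif kind == "LEAF":
            _, pred = spec
            args += [path, "LEAF", str(pred)]
        else:
            raise ValueError(f"unknown node kind: {kind}")
    return args


def forest_run_args(forest_id, features, mode="CLASSIFICATION"):
    """Build the argument list for ML.FOREST.RUN.

    The "name:value,name:value" feature encoding is an assumption.
    """
    feature_str = ",".join(f"{name}:{value}" for name, value in features.items())
    return ["ML.FOREST.RUN", forest_id, feature_str, mode]


# Hypothetical one-split tree; the leaf labels are illustrative only.
add_cmd = forest_add_args("titanic", 0, [
    (".", ("CATEGORIC", "sex", "male")),
    (".l", ("LEAF", "Died")),
    (".r", ("LEAF", "Survived")),
])
run_cmd = forest_run_args("titanic", {"sex": "male", "age": 34})

print(" ".join(add_cmd))
print(" ".join(run_cmd))
```

With redis-py, lists like these could be sent via `client.execute_command(*add_cmd)`, provided the server has the Redis-ML module loaded and accepts this encoding.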
Real World Challenge
• Ad serving company
• Need to serve 20,000 ads/sec @ 50 ms data-center latency
• Runs 1K campaigns → 1K random forests
• Each forest has 15K trees
• On average each tree has 7 levels (depth)
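To get a feel for the scale these figures imply, here is a back-of-the-envelope calculation. It assumes, beyond what the slide says, that each ad request evaluates exactly one forest and that walking a tree costs about one comparison per level; the variable names are illustrative.

```python
ADS_PER_SEC = 20_000       # from the slide
FORESTS = 1_000            # one forest per campaign, from the slide
TREES_PER_FOREST = 15_000  # from the slide
AVG_DEPTH = 7              # average levels per tree, from the slide

# Total number of trees that must be kept ready to serve.
total_trees = FORESTS * TREES_PER_FOREST

# Assumption: one forest per ad request, about one comparison per tree level.
comparisons_per_ad = TREES_PER_FOREST * AVG_DEPTH
comparisons_per_sec = ADS_PER_SEC * comparisons_per_ad

print(f"{total_trees:,} trees in total")
print(f"{comparisons_per_ad:,} node comparisons per ad")
print(f"{comparisons_per_sec:,} node comparisons per second")
```

Under these assumptions the service holds 15 million trees and performs on the order of two billion node comparisons per second.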
Ad Serving Costs: Homegrown v. Redis
• Homegrown: 1,247 x c4.8xlarge
• Redis: 35 x c4.8xlarge
• Cut computing infrastructure by 97%
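The 97% figure follows directly from the two instance counts on this slide, as this short check shows (variable names are illustrative):

```python
homegrown_instances = 1_247  # c4.8xlarge, from the slide
redis_instances = 35         # c4.8xlarge, from the slide

# Fraction of instances no longer needed.
reduction = 1 - redis_instances / homegrown_instances
# How many homegrown instances one Redis instance replaces.
ratio = homegrown_instances / redis_instances

print(f"reduction: {reduction:.1%}")
print(f"consolidation ratio: {ratio:.1f}x")
```

This counts instances only; it says nothing about other costs such as licensing or operations.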
Summary
• Train with Spark, Serve with Redis
• 97% lower resource cost for serving
• Simplify ML lifecycle
• Redise (Cloud or Pack):
‒Scaling, HA, Performance
‒PAYG – cost optimized
‒Ease of use
‒Supported by the teams who created Spark and
Redis
Thank you!
