Machine Learning logistics

© 2017 MapR Technologies
Machine Learning Model Management

Contact Information
Ted Dunning, PhD
Chief Application Architect, MapR Technologies
Committer, PMC member, board member, ASF
O’Reilly author
Email tdunning@mapr.com tdunning@apache.org
Twitter @Ted_Dunning

Machine Learning Everywhere
Image courtesy of Mtell, used with permission. Images © Ellen Friedman.

Traditional View

Traditional View: This isn’t the whole story

90% of the effort in successful machine
learning isn’t in the training or model dev…
It’s the logistics

Why?
• Just getting the training data is hard
– Which data? How to make it accessible? Multiple sources!
– New kinds of observations force restarts
– Requires a ton of domain knowledge
• The myth of the unitary model
– You can’t train just one
– You will have dozens of models, likely hundreds or more
– Handoff to new versions is tricky
– You need run-time experience to be sure which model is better


What Machine Learning Tool is Best?
• Most successful groups keep several “favorite” machine
learning tools at hand
– No single tool is best in every situation
• The most important tool is a platform that supports logistics well
– Don’t have to do everything at the application level
– Lots of what matters can be handled at the platform level
• A good design for the logistics can make a big difference

Some Gotchas
• Ops-oriented people will not “get it” regarding modeling
subtleties
• Data scientists will not “get it” regarding operational realities
• Therefore, modelers have to deliver self-contained models
• And, ops has to provide pre-wired structure

Rendezvous Architecture
[Figure: rendezvous architecture. Labels: request, Input, Model 1, Model 2, Model 3, Scores, Rendezvous, Results, response]

Rendezvous to the Rescue: Better ML Logistics
• Stream-1st architecture is a powerful approach with surprisingly
widespread advantages
– Innovative technologies are emerging for streaming data
• Microservices approach provides flexibility
– Streaming supports microservices (if done right)
• Containers remove surprises
– Predictable environment for running models

Rendezvous: Mainly for Decisioning Engines
• Decisioning models
– Looking for a “right answer”
– Simpler than reinforcement learning
• Examples include:
– Fraud detection
– Predictive analytics / market prediction
– Churn prediction (as in telecommunications)
– Yield optimization
– Deep learning in the form of speech or image recognition, in some cases

Why Stream?
Munich surfing wave. Image © 2017 Ellen Friedman

Stream-1st Architecture: Basis for MicroServices
Stream instead of database as the shared “truth”
[Figure labels: POS 1..n, Fraud detector, Last card use, Updater, Card analytics, Other, card activity]
Image © 2016 Ted Dunning & Ellen Friedman, from Chapter 6 of the O’Reilly book Streaming Architecture, used with permission

Streaming Isolates Services
[Figure labels: Data source, stream, Consumer]

With MapR, Geo-Distributed Data Appears Local
[Figure, built up over two slides. Labels: Data source, stream, stream, Consumer, Global Data Center, Regional Data Center]

Features of Good Streaming
• It is Persistent
– Messages stick around for other consumers
– Consumers don’t affect producers
– Consumer doesn’t have to be online when message arrives
• It is Performant
– You don’t have to worry if a stream can keep up
• It is Pervasive
– It is there whenever you need it, no need to deploy anything
– How much work is it to create a new file? Why harder for a stream?
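
To make "persistent" concrete, here is a minimal sketch of a stream as an append-only log in which each consumer keeps its own read position. The `Stream` class, the consumer names, and the message fields are all hypothetical stand-ins, not the API of any real streaming system.

```python
from dataclasses import dataclass, field


@dataclass
class Stream:
    """Toy in-memory append-only log (hypothetical stand-in for a real stream)."""
    messages: list = field(default_factory=list)
    offsets: dict = field(default_factory=dict)  # consumer name -> next index to read

    def append(self, message):
        # Producers only append; they never wait on or learn about consumers.
        self.messages.append(message)

    def poll(self, consumer):
        # Each consumer tracks its own position, so reading removes nothing.
        start = self.offsets.get(consumer, 0)
        batch = self.messages[start:]
        self.offsets[consumer] = len(self.messages)
        return batch


stream = Stream()
stream.append({"card": "1234", "amount": 20.0})
stream.append({"card": "5678", "amount": 75.5})

fraud_batch = stream.poll("fraud-detector")   # an early consumer
stream.append({"card": "1234", "amount": 5.0})
late_batch = stream.poll("card-analytics")    # joins later, still sees every message
fraud_next = stream.poll("fraud-detector")    # only the message it has not yet seen
```

The late consumer did not have to be online when the first messages arrived, and adding it changed nothing for the producer or for the other consumer.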

Stream transport supports microservices

But we talked about decision engines?!?

What We Ultimately Want
[Figure labels: request, Model, response]

But This Isn’t The Answer
[Figure labels: request, Load balancer, Model 1, Model 2, Model 3, response]

First Try with Streams
[Figure labels: request, Input, Model 1, Model 2, Model 3, response, and a "?" where the response should come from]

First Rendezvous
[Figure labels: request, Input, Model 1, Model 2, Model 3, Scores, Rendezvous, Results, response]

Some Key Points
• Note that all models see identical inputs
• All models run in production setting
• All models send scores to same stream
• The rendezvous server decides which scores to ignore
• Roll forward, roll back, correlated comparison are all now trivial
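
A minimal sketch of why roll forward and roll back become trivial. It assumes the rendezvous server holds an ordered preference list of model names and ignores every score not chosen through that list. The `Score` record, the `rendezvous` function, and the preference-list mechanism are illustrative assumptions; the slides say only that the rendezvous server decides which scores to ignore. Timing and deadlines are left out.

```python
from dataclasses import dataclass


@dataclass
class Score:
    request_id: str
    model: str
    value: float


def rendezvous(request_id, scores, preference):
    """Return the score from the most-preferred model that answered.

    `preference` lists model names from most to least preferred; every
    other score for this request is simply ignored.
    """
    by_model = {s.model: s for s in scores if s.request_id == request_id}
    for model in preference:
        if model in by_model:
            return by_model[model]
    return None  # no preferred model answered


# All three models scored the same input and wrote to the same scores stream.
scores_stream = [
    Score("req-1", "model-1", 0.12),
    Score("req-1", "model-2", 0.15),
    Score("req-1", "model-3", 0.40),
]

current = rendezvous("req-1", scores_stream, ["model-1", "model-2"])
# Roll forward: stop ignoring model-3 by putting it first.
rolled_forward = rendezvous("req-1", scores_stream, ["model-3", "model-1"])
# Roll back: restore the old preference list.
rolled_back = rendezvous("req-1", scores_stream, ["model-1", "model-2"])
```

No model is redeployed in either direction; only the list changes. Because every model scored the same request, the ignored scores remain available for correlated comparison.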

Reality Check, Injecting External State
[Figure labels: The world, request, Raw, Add external data, Database, Input, Model 1, Model 2, Model 3]

Recording Raw Data (as it really was)
[Figure labels: Input, Decoy, Model 2, Model 3, Scores, Archive]

Quality & Reproducibility of Input Data is Important!
• Recording raw-ish data is really a big deal
– Data as seen by a model is worth gold
– Data reconstructed later often has time-machine leaks
– Databases were made for updates, streams are safer
• Raw data is useful for non-ML cases as well (think flexibility)
• Decoy model records training data as seen by models under
development & evaluation
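
A minimal sketch of a decoy, assuming external state is attached to the raw request once, before any model runs. The `DecoyModel` class, the `enrich` helper, and the field names are hypothetical. The point is that the archived record keeps the values the models actually saw, even after the database moves on.

```python
import json


class DecoyModel:
    """Looks like a model to the input stream but only archives what it sees."""

    def __init__(self):
        self.archive = []  # stand-in for an archive file or stream

    def score(self, model_input):
        # Serialize immediately so later changes to the dict cannot
        # alter the archived record.
        self.archive.append(json.dumps(model_input, sort_keys=True))
        return None  # a decoy produces no score


def enrich(raw_request, external_state):
    # External state is attached once, before any model runs, so the
    # decoy and the real models all receive identical input.
    return {**raw_request, "last_card_use": external_state.get(raw_request["card"])}


external_state = {"1234": "2017-09-01T10:00:00"}
model_input = enrich({"card": "1234", "amount": 20.0}, external_state)

decoy = DecoyModel()
decoy.score(model_input)

# The database is updated afterwards; the archived record is unaffected.
external_state["1234"] = "2017-09-02T08:30:00"
archived = json.loads(decoy.archive[0])
```

Rebuilding this input later from the database would pick up the newer timestamp, which is exactly the kind of time-machine leak the archive avoids.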

Canary for Comparison
[Figure labels: Input, Real model, Canary, Decoy, Archive, ∆, Result]

What Does the Canary Do?
• The canary is a real model, but is very rarely updated
• The canary results are almost never used for decisioning
• The virtue of the canary is stability
• Comparing to the canary results gives insight into new models
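
One way to read the ∆ in the canary diagram is as a per-request difference between a candidate model's score and the canary's score on the same input. The sketch below assumes scores are keyed by request ID; the function name, the sample numbers, and the 0.25 threshold are made up for illustration.

```python
from statistics import mean


def canary_deltas(canary_scores, candidate_scores):
    """Per-request score differences for requests both models scored."""
    shared = sorted(canary_scores.keys() & candidate_scores.keys())
    return {rid: candidate_scores[rid] - canary_scores[rid] for rid in shared}


canary = {"req-1": 0.10, "req-2": 0.80, "req-3": 0.30}
candidate = {"req-1": 0.12, "req-2": 0.78, "req-3": 0.90}

deltas = canary_deltas(canary, candidate)
mean_abs_delta = mean(abs(d) for d in deltas.values())

# Requests where the candidate departs sharply from the canary deserve a look.
THRESHOLD = 0.25  # hypothetical cut-off
suspicious = [rid for rid, d in deltas.items() if abs(d) > THRESHOLD]
```

Because the canary rarely changes, a shift in these differences over time points at the new model or at the data rather than at the reference.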

Isolated Development With Stream Replication
[Figure labels, Production: The world, request, Raw, Add external data, Input, Internal 1, Internal 2, Internal 3, Model 1, Model 2, Model 3]
[Figure labels, Development: Raw, New external data, Input, Internal 4, Model 4]

[Figure: full architecture, built up over three slides. Labels: Raw, Input, Features / profiles, m1, m2, m3, Decoy, Archive, Scores, Rendezvous, Results, Metrics]

Models in production live in the real world:
Conditions may (will) change

Not Such Bad Ideas
• Keep models running “in the wings”
– Don’t wait until conditions change to start building the next model
– Keep new short-history models ready to roll, some graybeards as well
• Hot hand-off
– With rendezvous: just stop ignoring the new best model
• Deploy a canary server
– Keep an old model active as a reference
– If it was 90% correct, difference with any better model should be small
– Score distribution should be roughly constant
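
A minimal sketch of checking that a score distribution stays "roughly constant": compare the deciles of canary scores from two time windows. The sample data is synthetic, the function name is hypothetical, and deciles are only one reasonable choice of summary; this is not presented as the method behind the slides' quantile plot.

```python
from statistics import quantiles


def decile_shift(reference_scores, recent_scores):
    """Largest absolute difference between matching deciles of two samples."""
    ref = quantiles(reference_scores, n=10)
    new = quantiles(recent_scores, n=10)
    return max(abs(a - b) for a, b in zip(ref, new))


# Synthetic canary scores from two time windows.
last_week = [i / 100 for i in range(100)]
this_week_stable = [i / 100 + 0.01 for i in range(100)]
this_week_drifted = [min(1.0, i / 100 + 0.30) for i in range(100)]

stable_shift = decile_shift(last_week, this_week_stable)
drifted_shift = decile_shift(last_week, this_week_drifted)
```

A small shift suggests conditions are stable; a large one suggests the inputs have changed and the models waiting in the wings may be needed.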

Correlated Comparison of Score Quantiles

Sample Model Cascade
[Figure: cascade of model A followed by model B. Labels: A, B, Fraud, Clean]
Assume that finding more frauds is all we care to do

Some Data

Consisting of Type 1

And Type 2

Sample Model Cascade
[Figure: the same cascade, annotated. Labels: A, B, Fraud, Clean, Good with type 1, Good with type 2]

Baseline Conditions
• Model A
– 80% recall on type 1, 0% recall on type 2 (40% net)
• Model B
– 0% recall on type 1, 80% recall on type 2 (40% net)
• Combined
– No overlap in responses
– 80% recall on type 1 (due to model A)
– 80% recall on type 2 (due to model B)
– 80% recall overall

“New and Improved”
• Suppose model A is “improved”
– Before: 80% recall on type 1, 0% recall on type 2 (40% net)
– After: 40% recall on type 1, 100% recall on type 2 (70% net)
• Combined after change
– Huge overlap in responses
– Model B has no effect
– 70% recall overall
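
The arithmetic behind these numbers can be checked with a short sketch. It assumes the two fraud types are equally common, which is what the slides' net figures imply (80% and 0% averaging to 40%). It also assumes a case is caught when either model flags it and that the models miss independently; here that second assumption changes nothing, because in every pairing one model's recall is 0% or 100%. All names are hypothetical.

```python
MIX = {"type1": 0.5, "type2": 0.5}  # assumed equal split of fraud types


def combined_recall(recall_a, recall_b):
    """Recall of 'flag if either model flags', assuming independent misses."""
    return 1.0 - (1.0 - recall_a) * (1.0 - recall_b)


def overall(recalls_by_type):
    # Weight each type's recall by how common that type is.
    return sum(MIX[t] * r for t, r in recalls_by_type.items())


def cascade(model_a, model_b):
    per_type = {t: combined_recall(model_a[t], model_b[t]) for t in MIX}
    return overall(per_type)


model_b = {"type1": 0.0, "type2": 0.8}
a_before = {"type1": 0.8, "type2": 0.0}
a_after = {"type1": 0.4, "type2": 1.0}

a_alone_before = overall(a_before)            # model A by itself, before
a_alone_after = overall(a_after)              # model A by itself, after
combined_before = cascade(a_before, model_b)  # whole cascade, before
combined_after = cascade(a_after, model_b)    # whole cascade, after
```

Model A alone improves from 40% to 70%, yet the cascade falls from 80% to 70%, because the new A gives up type 1 cases that nothing else catches while duplicating what B already found.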

Coupling Paradox

Is There Any Hope?
• This kind of problem is HARD
– Do your competitor’s marketing model and your own couple?
• Where possible, use ensembles instead of cascades
– Not as simple as it sounds
• Where possible, deploy composite models as units
– Not as simple as it sounds
• Always measure everything!

How to Do Better
• Data + the right question + domain knowledge matter!
• Prioritize – put serious effort into infrastructure
– DataOps requires more than just data science
• Persist – use streams to keep data around
• Measure – everything, and record it
• Meta-analyze – understand and see what is happening
• Containerize – make deployment repeatable, easy
• Oh… don’t forget to do some machine learning, too

Additional Resources
O’Reilly report by Ted Dunning & Ellen Friedman, © March 2017
Read free courtesy of MapR:
https://mapr.com/geo-distribution-big-data-and-analytics/
O’Reilly book by Ted Dunning & Ellen Friedman, © March 2016
https://mapr.com/streaming-architecture-using-apache-kafka-mapr-streams/

O’Reilly book by Ted Dunning & Ellen Friedman, © June 2014
https://mapr.com/practical-machine-learning-new-look-anomaly-detection/
O’Reilly book by Ellen Friedman & Ted Dunning, © February 2014
https://mapr.com/practical-machine-learning/

by Ellen Friedman, 8 Aug 2017, on MapR blog:
https://mapr.com/blog/tensorflow-mxnet-caffe-h2o-which-ml-best/
by Ted Dunning, 13 Sept 2017, in InfoWorld:
https://www.infoworld.com/article/3223688/machine-learning/machine-learning-skills-for-software-engineers.html

New book: Machine Learning Logistics
Model Management in the Real World
O’Reilly book by Ellen Friedman & Ted Dunning © Sept 2017
Pre-register for a free PDF copy of the book when it becomes available on 26 September, courtesy of MapR
http://info.mapr.com/2017_Content_Machine-Learning-Logistics_eBook_Prereg_RegistrationPage.html
Going to Strata Data NYC? Book will be released 26 Sept 2017:
Visit MapR booth for free book signings or to talk about logistics

Please support women in tech – help build
girls’ dreams of what they can accomplish
© Ellen Friedman 2015 #womenintech #datawomen

Q&A
@mapr
tdunning@mapr.com
ENGAGE WITH US
@Ted_Dunning
