Using MLOps to Bring ML to Production/The Promise of MLOps
Foundation for ML
Your data +
Microsoft data
Breakthrough
advancements
Data Cloud Models
Power of Azure
Speech Vision Language
2016 2017 2018 2018
Microsoft ML breakthroughs
Microsoft 365
ML at Microsoft
| Research
ML at scale
Monthly active
Office 365 users
using AI
180
million
Questions Asked
of Cortana
18
Billion
Number of Signals
Analyzed to Block
Emerging Threats
DAILY
6.5
Trillion
But ML is HARD!
Building a model
Building
a model
Data ingestion Data analysis
Data
transformation
Data validation Data splitting
Trainer
Model
validation
Training
at scale
Logging Roll-out Serving Monitoring
Ok, but, like, I’m
a data scientist. IDGAF
I don’t care
about all that.
Yes You Do!
Cowboys and Ranchers Can Be Friends!
Data Scientist
• Quick iteration
• Frameworks they understand
• Best of breed tools
• No management headaches
• Unlimited scale
SRE/ML Engineers
• Reuse of tooling and platforms
• Corporate compliance
• Observability
• Uptime
Haven’t I Heard This Before?
GitOps = Git + Dev + Ops
GitOps
== VELOCITY and SECURITY
MLOps!
MLOps = ML + DEV + OPS
Experiment
Data Acquisition
Business Understanding
Initial Modeling
Develop
Modeling
Operate
Continuous Delivery
Data Feedback Loop
System + Model Monitoring
ML
+ Testing
Continuous Integration
Continuous Deployment
MLOps Benefits
• Code drives generation
and deployments
• Pipelines are
reproducible and
verifiable
• All artifacts can be
tagged and audited
• SWE best practices for
quality control
• Offline comparisons of
model quality
• Minimize bias and
enable explainability
• Controlled rollout
capabilities
• Live comparison of
predicted vs. expected
performance
• Results fed back to
watch for drift and
improve model
Automation / Observability
Validation
Reproducibility / Auditability
== VELOCITY and SECURITY (For ML)
Internal MLOps Platforms
FBLearner Flow
TensorFlow Extended
Uber’s Michelangelo
Microsoft Aether
But I Don’t Work at a
Big Company With
Thousands of
ML Engineers!
Build Your Own MLOps Platform
And many MANY more…
Cloud Provider
MLOps Platforms
Real World Multi-Cloud
CI/CD Pipeline
Process Train Stage Serve
Data
Distributed Cloud
SRE/ML Engineers
Data Scientist
ENV
#1
ENV
#2
Azure DevOps Pipelines
Cloud-hosted pipelines for Linux, Windows and macOS.
Any language, any platform, any cloud
Build, test, and deploy Node.js, Python, Java, PHP, Ruby, C/C++, .NET, Android, and iOS apps.
Run in parallel on Linux, macOS, and Windows.
Deploy to Azure, AWS, GCP or on-premises.
Extensible
Explore and implement a wide range of community-built build, test, and deployment tasks,
along with hundreds of extensions from Slack to SonarCloud.
Support for YAML, reporting and more
Containers and Kubernetes
Easily build and push images to container registries
like Docker Hub and Azure Container Registry.
Deploy containers to individual hosts or Kubernetes.
Azure DevOps + Azure ML
First Class Model Training Tasks
CI pipeline captures:
1. Create sandbox
2. Run unit tests and code quality checks
3. Attach to compute
4. Run training pipeline
5. Evaluate model
6. Register model
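A minimal sketch of the gate between steps 5 and 6, assuming "evaluate" yields a dictionary of metrics and that a model is registered only when it beats what is already in production. This is not the Azure ML API: `should_register`, the metric name and the `min_gain` threshold are hypothetical names chosen for illustration.

```python
from typing import Mapping


def should_register(candidate: Mapping[str, float],
                    production: Mapping[str, float],
                    metric: str = "accuracy",
                    min_gain: float = 0.0) -> bool:
    """Return True when the candidate beats production on `metric` by more than `min_gain`.

    If production has no value for the metric (for example, the very first
    model), the candidate is registered.
    """
    if metric not in candidate:
        raise KeyError(f"candidate has no metric named {metric!r}")
    if metric not in production:
        return True
    return candidate[metric] - production[metric] > min_gain


if __name__ == "__main__":
    # Hypothetical metric values, for illustration only.
    print(should_register({"accuracy": 0.93}, {"accuracy": 0.91}))
    print(should_register({"accuracy": 0.90}, {"accuracy": 0.91}))
```

Keeping the rule in code means the CI run, not a person, decides whether a model reaches the registry, and the decision can be audited later.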
Automated Deployment
CD pipeline captures:
1. Package model into container
image
2. Validate and profile model
3. Deploy model to DevTest (ACI)
4. If all is well, proceed to rollout
to AKS
Everything is done via the CLI
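A rough sketch of what "proceed to rollout" could mean in practice, assuming traffic to the new model is ramped up geometrically rather than switched all at once. The function `ramp_schedule` and its defaults (start at 1%, double each step) are illustrative assumptions, not part of Azure DevOps or AKS.

```python
from typing import List


def ramp_schedule(start_percent: float = 1.0, factor: float = 2.0) -> List[float]:
    """Return the traffic percentages for each rollout step, ending at 100."""
    if not 0 < start_percent <= 100:
        raise ValueError("start_percent must be in (0, 100]")
    if factor <= 1:
        raise ValueError("factor must be greater than 1")
    steps: List[float] = []
    share = start_percent
    while share < 100.0:
        steps.append(share)
        share *= factor
    steps.append(100.0)  # the final step always sends all traffic to the new model
    return steps


if __name__ == "__main__":
    print(ramp_schedule())
```

Each step would normally be gated on health checks and model metrics before moving to the next one.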
Model Versioning & Storage
• Which data?
• Which experiment / previous model(s)?
• Where’s the code / notebook?
• Was it converted / quantized?
• Private / compliant data
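One possible shape for such a record, as a sketch: the questions above become fields, and a content hash of those fields serves as the version identifier. `ModelRecord` and all of its field names are hypothetical, and the 13-character identifier length is simply chosen to resemble the short hashes shown later in the deck.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ModelRecord:
    """Lineage facts for one trained model (field names are illustrative)."""
    data_uri: str               # which data
    parent_model: str           # which experiment / previous model
    code_ref: str               # where the code / notebook lives
    quantized: bool             # was it converted / quantized?
    contains_private_data: bool

    def version_id(self) -> str:
        """Derive a short, deterministic identifier from the record's contents."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:13]


if __name__ == "__main__":
    record = ModelRecord("store://training/2019-06", "none", "git:abc123", False, True)
    print(record.version_id())
```

Because the identifier is derived from the contents, changing any lineage fact produces a different version.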
Model Validation
• Data (changes to shape / profile)
• Model in isolation (offline A/B)
• Model + app (functional testing)
• Only deploy after initial validation passes
• Ramp up traffic to new model using A/B
experimentations
• Functional behavior
• Performance characteristics
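The first check, changes to the shape of the data, can be sketched as a comparison of each incoming row against an expected schema. `validate_batch` and the example column names are assumptions made for illustration; a real pipeline would likely also compare statistics, not just names and types.

```python
from typing import Any, Dict, List


def validate_batch(rows: List[Dict[str, Any]], schema: Dict[str, type]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the batch passes."""
    problems: List[str] = []
    for index, row in enumerate(rows):
        for name in sorted(schema.keys() - row.keys()):
            problems.append(f"row {index}: missing column {name!r}")
        for name in sorted(row.keys() - schema.keys()):
            problems.append(f"row {index}: unexpected column {name!r}")
        for name, expected in schema.items():
            if name in row and not isinstance(row[name], expected):
                problems.append(
                    f"row {index}: column {name!r} is {type(row[name]).__name__}, "
                    f"expected {expected.__name__}"
                )
    return problems


if __name__ == "__main__":
    schema = {"income": float, "age": int}
    print(validate_batch([{"income": 52000.0, "age": 41}], schema))
    print(validate_batch([{"income": "52k", "zip": "98052"}], schema))
```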
Model Profiling
Model Deployment
• Focus on ML, not DevOps
• Get telemetry for service health and model behavior
• code-generation
• API specifications / interfaces
• Cloud Services
• Mobile / Embedded Applications
• Edge Devices
• Quantize / optimize models for target platform
• Compliant + Safe
Seems Like a Lot of Work…
MLOps Gets You to Production
• End-to-end ownership by data science teams
using SWE best practices
• Continuous delivery of value to end users
• Enables lineage, auditability and regulatory
compliance through consistency
Ok… but WHY?
What Does All This Stuff Solve For?
1. Does My Model Actually Work?
2. What Did My Customers See?
3. Is My Model Still Good?
Does My Model
Actually Work?
Does My Model Actually Work?
SRE/ML Engineers, Data Scientist
Laptop, The Cloud
“Time to test out my model…”
“Looks good to me! To Production!”
“Wait, what? Oh… oh no…”
“WOAH there.”
Source Control
“What is happening…”
A Small Example of Issues You Can Have…
• Inappropriate HW/SW stack
• Mismatched driver versions
• Crash looping deployment
• Data/model versioning [Nick Walsh]
• Non-standard images/OS version
• Pre-processing code doesn’t match
production pre-processing
• Production data doesn’t match
training/test data
• Output of the model doesn’t match
application expectations
• Hand-coded heuristics better than model
[Adam Laiacano]
• Model freshness (train on out-of-date
data/input shape changed)
• Test/production statistics/population
shape skew
• Overfitting on training/test data
• Bias introduction (or not tested)
• Over/under HW provisioning
• Latency issues
• Permissions/certs
• Failure to obey health checks
• Killed production model before roll out
of new/in wrong order
• Thundering herd for new model
• Logging to the wrong location
• Storage for model not allocated
properly/accessible by deployment
tooling
• Route to artifacts not available for
download
• API signature changes not
propagated/expected
• Cross-data center latency
• Expected benefit doesn’t materialize (e.g.
multiple components in the app change
simultaneously)
• Get wrong/no traffic because A/B config
didn’t roll out
• Get too much traffic too soon (expected
to canary/exponential roll out)
• Lack of visibility into real-time model
behavior (detecting data drift, live data
distribution vs train data, etc) [Nick
Walsh]
• Outliers not predicted [MikeBSilverman]
• Change was a good change, but didn’t
communicate with the rest of the team
(so you must roll back)
• No dates! (date to measure
impact/improvement against a pre-
agreed measure; date scheduled to
assess data changes) [Mary Branscombe]
• No CI/CD; manual changes untracked
[Jon Peck]
• LACK OF DOCUMENTATION!! (the
problem, the testing, the solution, lots
more) [Terry Christiani]
• Successful model causes pain elsewhere
in the organization (e.g. detecting faults
previously missed) [Mark Round]
Or It Just Doesn’t Work!
At All!
Does My Model Actually Work?
SRE/ML Engineers, Data Scientist
Laptop The Cloud
Source Control
Automated
Validation &
Profiling
Package
For Rollout
Explain Model
& Look for
Bias
Clean/
Minimize
Code
Sane
Deployment
Nice. Nice.
✓
But I Can Do All
These Manually…
No.
MLOps is a Platform and a Philosophy
Even if:
o Every data scientist trained...
o And you had all the tools necessary...
o And they all worked together...
o And your SREs understood ML modeling...
o And and and and ...
You’d still need a permanent, repeatable
record of what you did
That’s MLOps!
What Does All This Stuff Solve For?
1. Does My Model Actually Work?
2. What Did My Customers See?
3. Is My Model Still Good?
What Did My
Customers See?
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
I’d Like a loan,
please.
Source Control
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
No.
Source Control
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Ok, but why?
Source Control
Source Control
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Uh oh.
Lawyer (repeated 18 times on the slide)
It’s Not Just About Explainability!
• Yes, models are complicated
• But, that’s not enough:
o What data did you train on?
o How did you transform/exclude outliers?
o What are the data statistics?
o Did anything change between code and production?
o What model did you actually serve (to this person)?
• MLOps can help!
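To make the last question concrete, here is a minimal sketch of an append-only record of what each request was served. The class `PredictionLog`, its method names and the hash-chaining scheme are assumptions for illustration, not the design of any particular metadata store; the version string in the usage example is one of the identifiers shown on the later slides.

```python
import hashlib
import json
from typing import Any, Dict, List, Optional


class PredictionLog:
    """Append-only log in which each entry carries the hash of the previous one."""

    def __init__(self) -> None:
        self._entries: List[Dict[str, Any]] = []

    @staticmethod
    def _digest(body: Dict[str, Any]) -> str:
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()

    def record(self, request_id: str, model_version: str,
               features: Dict[str, Any], output: Any) -> str:
        """Store one served prediction and return its entry hash."""
        prev_hash = self._entries[-1]["entry_hash"] if self._entries else ""
        body = {"request_id": request_id, "model_version": model_version,
                "features": features, "output": output, "prev_hash": prev_hash}
        entry_hash = self._digest(body)
        self._entries.append({**body, "entry_hash": entry_hash})
        return entry_hash

    def served_to(self, request_id: str) -> Optional[Dict[str, Any]]:
        """Return a copy of the entry for a request, or None if it was never logged."""
        for entry in self._entries:
            if entry["request_id"] == request_id:
                return dict(entry)
        return None

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry makes this False."""
        prev_hash = ""
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            if body["prev_hash"] != prev_hash or self._digest(body) != entry["entry_hash"]:
                return False
            prev_hash = entry["entry_hash"]
        return True


if __name__ == "__main__":
    log = PredictionLog()
    log.record("loan-001", "9ce88802f0759", {"income": 52000}, "denied")
    print(log.served_to("loan-001")["model_version"])
    print(log.verify())
```

With a record like this, "why didn't I get a loan?" can be traced to a specific model version even after newer versions have been deployed.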
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Source Control
Automated
Validation &
Profiling
Package
For Rollout
Explain Model
& Look for
Bias
Clean/
Minimize
Code
Sane
Deployment
32c04681d7573
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Automated
Validation &
Profiling
Package
For Rollout
Explain Model
& Look for
Bias
Clean/
Minimize
Code
Sane
Deployment
Source Control
Immutable
Metadata Store
b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759
9ce88802f0759
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Automated
Validation &
Profiling
Package
For Rollout
Explain Model
& Look for
Bias
Clean/
Minimize
Code
Sane
Deployment
Source Control
Immutable
Metadata Store
b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759
32c04681d7573
Why didn’t I get
a loan?
9ce88802f0759
What Did My Customers See?
SRE/ML Engineers
The Cloud
Front End
Model Server
Customer
Automated
Validation &
Profiling
Package
For Rollout
Explain Model
& Look for
Bias
Clean/
Minimize
Code
Sane
Deployment
Source Control
Immutable
Metadata Store
b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759
32c04681d7573
32c04681d7573
9ce88802f0759
What Does All This Stuff Solve For?
1. Does My Model Actually Work?
2. What Did My Customers See?
3. Is My Model Still Good?
Is My Model
Still Good?
Is My Model Still Good?
SRE/ML Engineers
The Cloud
There is a
blue or
orange
DUCK inside
this barn.
What color
is the duck?
Let’s Use Machine
Learning!!
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server
f7c5f9fe7b762
It’s a
duck!
BLUE
There is a
blue or
orange
DUCK inside
this barn.
What color
is the duck?
But wait...
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server
f7c5f9fe7b762
It’s a
duck!
BLUE
5 Blue Ducks
995 Yellow Ducks
Accuracy = 99%
False Positive = 1%
???????????????????
Thomas
Bayes
$P(A \mid B) = \dfrac{P(B \mid A) \cdot P(A)}{P(B)}$
Bayes’ Theorem
Accuracy depends on
the population
distribution!
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server
f7c5f9fe7b762
It’s a
duck!
BLUE
995 Yellow Ducks
5 Blue Ducks
WRONG 2/3 of the Time!
Accuracy = 99%
False Positive = 1%
???????????????????
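The "2/3" figure can be checked with a few lines of arithmetic. This sketch assumes "Accuracy = 99%" means the model calls 99% of blue ducks blue, and "False Positive = 1%" means it calls 1% of the other ducks blue; the function name is illustrative.

```python
def chance_blue_call_is_right(n_blue: int, n_other: int,
                              true_positive_rate: float,
                              false_positive_rate: float) -> float:
    """Bayes' theorem on counts: P(duck is blue | model says blue)."""
    true_positives = n_blue * true_positive_rate
    false_positives = n_other * false_positive_rate
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    skewed = chance_blue_call_is_right(5, 995, 0.99, 0.01)
    balanced = chance_blue_call_is_right(500, 500, 0.99, 0.01)
    print(f"5 blue / 995 other: wrong {1 - skewed:.1%} of the time")
    print(f"500 blue / 500 other: wrong {1 - balanced:.1%} of the time")
```

With 5 blue ducks among 1,000, the model makes about 4.95 correct "blue" calls and about 9.95 incorrect ones, so roughly two out of three "blue" answers are wrong even though the model itself has not changed.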
Who cares…
This Can Be
Addressed!
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server
f7c5f9fe7b762
It’s a
duck!
BLUE
995 Yellow Ducks
5 Blue Ducks
Model Server
d4093cc84b267
But…
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server
995 Yellow Ducks
5 Blue Ducks
d4093cc84b267
Is My Model Still Good?
SRE/ML Engineers
The Cloud
Front End
Model Server 500 Yellow Ducks
500 Blue Ducks
d4093cc84b267
Is My Model Still Good?
• Models != Code – they can go stale... QUICKLY.
• IMPORTANT:
o Watch your model & data for drift from training
o Regularly (if not continuously) retrain, even before
performance begins to fail
o Rolling back several versions is not uncommon!
• Without an e2e MLOps pipeline, many of the
above are O(really really hard)!
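A small sketch of the "watch for drift" step, assuming the simplest possible signal: compare the share of each class seen in production with the share seen in training, and raise a flag when the total variation distance between the two passes a threshold. The function names and the 0.1 default threshold are assumptions for illustration, not a recommended setting.

```python
from collections import Counter
from typing import Dict, Iterable


def class_shares(labels: Iterable[str]) -> Dict[str, float]:
    """Fraction of observations belonging to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labels given")
    return {label: count / total for label, count in counts.items()}


def total_variation(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Half the sum of absolute differences between two distributions (0 = same, 1 = disjoint)."""
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def has_drifted(train_labels: Iterable[str], live_labels: Iterable[str],
                threshold: float = 0.1) -> bool:
    """True when the live class mix has moved away from the training mix."""
    return total_variation(class_shares(train_labels), class_shares(live_labels)) > threshold


if __name__ == "__main__":
    train = ["yellow"] * 995 + ["blue"] * 5
    live = ["yellow"] * 500 + ["blue"] * 500
    print(has_drifted(train, live))
```

For the duck populations in the slides (995/5 at training time, 500/500 later), the distance is 0.495, far above the example threshold, which is the cue to retrain.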
What Does All This Stuff Solve For?
1. Does My Model Actually Work?
2. What Did My Customers See?
3. Is My Model Still Good?
Next for MLOps
MLOps Gives* You…
• Software best practices for building machine
learning solutions
• Repeatable workflow for training a model and
rolling it out to production
• An immutable record of what’s actually running
• Lineage of model creation including data sources
• Acceleration from code to customer benefits
* Requires some human and software work
What’s Next for MLOps
• Simplify monitoring and retraining
• Extend MLOps to data, including prep and profiling
• Enterprise features
o Test cases
o Auditing
o Security
o Resource management (bin packing / resource optimization)
o Network isolation
• Metadata and API standards
Or, better yet, you tell us!
It’s a whole new world
• Data science will touch
EVERY industry.
• We can’t ask everyone to
get a PhD in statistics,
though.
• How do WE help everyone
take advantage of this
transformation?
me: David Aronchick (david.aronchick@microsoft.com)
twitter: @aronchick
github:
• https://github.com/aronchick/kubeflow-and-mlops
• https://aka.ms/mlops
THANK YOU!
Using MLOps to Bring ML to Production/The Promise of MLOps

  • 2. Foundation for ML Your data + Microsoft data Breakthrough advancements Data Cloud Models Power of Azure
  • 3. SpeechVision Language 2016 2017 20182018 Microsoft ML breakthroughs
  • 4. Microsoft 365 ML at Microsoft | Research
  • 5. ML at scale Monthly active Office 365 users using AI 180 million Questions Asked of Cortana 18 Billion Number of Signals Analyzed to Block Emerging Threats DAILY 6.5 Trillion
  • 6. But ML is HARD!
  • 8. Building a model Data ingestion Data analysis Data transformation Data validation Data splitting Trainer Model validation Training at scale LoggingRoll-out Serving Monitoring
  • 9. Ok, but, like, I’m a data scientist. IDGAF I don’t care about all that.
  • 11. 11
  • 12. Cowboys and Ranchers Can Be Friends! SRE/ML EngineersData Scientist • Quick iteration • Frameworks they understand • Best of breed tools • No management headaches • Unlimited scale • Reuse of tooling and platforms • Corporate compliance • Observability • Uptime
  • 13. Haven’t I Heard This Before?
  • 14. GitOps = Git + Dev + Ops
  • 17. MLOps = ML + DEV + OPS Experiment Data Acquisition Business Understanding Initial Modeling Develop Modeling Operate Continuous Delivery Data Feedback Loop System + Model Monitoring ML + Testing Continuous Integration Continuous Deployment
  • 18. MLOps Benefits • Code drives generation and deployments • Pipelines are reproducible and verifiable • All artifacts can be tagged and audited • SWE best practices for quality control • Offline comparisons of model quality • Minimize bias and enable explainability • Controlled rollout capabilities • Live comparison of predicted vs. expected performance • Results fed back to watch for drift and improve model Automation / Observability Validation Reproducibility /Auditability == VELOCITY and SECURITY (For ML)
  • 19. Internal MLOps Platforms FBLearner FlowTensorFlow Extended Uber’s Michelangelo Microsoft Aether
  • 20. But I Don’t Work at a Big Company With Thousands of ML Engineers!
  • 21. Build Your Own MLOps Platform And many MANY more… + +
  • 23. Real World Multi-Cloud CI/CD Pipeline Process Train Stage Serve Data Distributed Cloud SRE/ML Engineers Data Scientist ENV #1 ENV #2
  • 24. Azure DevOps Pipelines Cloud-hosted pipelines for Linux, Windows and macOS. Any language, any platform, any cloud Build, test, and deploy Node.js, Python, 
Java, PHP, Ruby, C/C++, .NET, Android, and iOS apps. Run in parallel on Linux, macOS, and Windows. Deploy to Azure, AWS, GCP or on-premises Extensible Explore and implement a wide range of community- built build, test, and deployment tasks, along with hundreds of extensions from Slack to SonarCloud. Support for YAML, reporting and more Containers and Kubernetes Easily build and push images to container registries like Docker Hub and Azure Container Registry. Deploy containers to individual hosts or Kubernetes.
  • 25. Azure DevOps + Azure ML
  • 26. First Class Model Training Tasks CI pipeline captures: 1. Create sandbox 2. Run unit tests and code quality checks 3. Attach to compute 4. Run training pipeline 5. Evaluate model 6. Register model
  • 27. Automated Deployment CD pipeline captures: 1. Package model into container image 2. Validate and profile model 3. Deploy model to DevTest (ACI) 4. If all is well, proceed to rollout to AKS Everything is done via the CLI
  • 28. Model Versioning & Storage • which data, • which experiment / previous model(s), • where’s the code / notebook) • Was it converted / quantized? • Private / compliant data
  • 29. Model Validation • Data (changes to shape / profile) • Model in isolation (offline A/B) • Model + app (functional testing) • Only deploy after initial validation passes • Ramp up traffic to new model using A/B experimentations • Functional behavior • Performance characteristics
  • 31. Model Deployment • Focus on ML, not DevOps • Get telemetry for service health and model behavior • code-generation • API specifications / interfaces • Cloud Services • Mobile / Embedded Applications • Edge Devices • Quantize / optimize models for target platform • Compliant + Safe
  • 32. Seems Like a Lot of Work…
  • 33. 33
  • 34. MLOps Gets You to Production • End-to-end ownership by data science teams using SWE best practices • Continuously deliver of value to end users. • Enables lineage, auditability and regulatory compliance through consistency
  • 36. What Does All This Stuff Solve For? 1. Does My Model Actually Work? 2. What Did My Customers See? 3. Is My Model Still Good?
  • 37. What Does All This Stuff Solve For? 1. Does My Model Actually Work? 2. What Did My Customers See? 3. Is My Model Still Good?
  • 39. Does My Model Actually Work? SRE/ML EngineersData Scientist Time to test out my model… Laptop The Cloud
  • 40. Does My Model Actually Work? SRE/ML EngineersData Scientist Laptop The Cloud
  • 41. Does My Model Actually Work? SRE/ML EngineersData Scientist Looks good to me! To Production! Laptop The Cloud
  • 42. Does My Model Actually Work? SRE/ML EngineersData Scientist Laptop The Cloud Wait, what? Oh… oh no…
  • 43. Does My Model Actually Work? SRE/ML EngineersData Scientist Laptop The Cloud WOAH there.
  • 44. Does My Model Actually Work? SRE/ML EngineersData Scientist Laptop The Cloud WOAH there. Source Control
  • 45. What is happening… Source Control Does My Model Actually Work? SRE/ML EngineersData Scientist Laptop The Cloud
  • 46. A Small Example of Issues You Can Have… • Inappropriate HW/SW stack • Mismatched driver versions • Crash looping deployment • Data/model versioning [Nick Walsh] • Non-standard images/OS version • Pre-processing code doesn’t match production pre-processing • Production data doesn’t match training/test data • Output of the model doesn’t match application expectations • Hand-coded heuristics better than model [Adam Laiacano] • Model freshness (train on out-of-date data/input shape changed) • Test/production statistics/population shape skew • Overfitting on training/test data • Bias introduction (or not tested) • Over/under HW provisioning • Latency issues • Permissions/certs • Failure to obey health checks • Killed production model before roll out of new/in wrong order • Thundering herd for new model • Logging to the wrong location • Storage for model not allocated properly/accessible by deployment tooling • Route to artifacts not available for download • API signature changes not propagated/expected • Cross-data center latency • Expected benefit doesn’t materialize (e.g. multiple components in the app change simultaneously) • Get wrong/no traffic because A/B config didn’t roll out • Get too much traffic too soon (expected to canary/exponential roll out) • Lack of visibility into real-time model behavior (detecting data drift, live data distribution vs train data, etc) [Nick Walsh] • Outliers not predicted [MikeBSilverman] • Change was a good change, but didn’t communicate with the rest of the team (so you must roll back) • No dates! (date to measure impact/improvement against a pre- agreed measure; date scheduled to assess data changes) [Mary Branscombe] • No CI/CD; manual changes untracked [Jon Peck] • LACK OF DOCUMENTATION!! (the problem, the testing, the solution, lots more) [Terry Christiani] • Successful model causes pain elsewhere in the organization (e.g. detecting faults previously missed) [Mark Round] Or It Just Doesn’t Work! At All!
  • 47. Does My Model Actually Work? SRE/ML Engineers Data Scientist Laptop The Cloud Source Control Automated Validation & Profiling Package For Rollout Explain Model & Look for Bias Clean/ Minimize Code Sane Deployment Nice. Nice. ✓
  • 48. But I Can Do All These Manually…
  • 49. No.
  • 50. MLOps is a Platform and a Philosophy Even if: o Every data scientist trained... o And you had all the tools necessary... o And they all worked together... o And your SREs understood ML modeling... o And and and and ... You’d still need a permanent, repeatable record of what you did
  • 52. What Does All This Stuff Solve For? 1. Does My Model Actually Work? 2. What Did My Customers See? 3. Is My Model Still Good?
  • 55. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer I’d Like a loan, please. Source Control
  • 56. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer No. Source Control
  • 57. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Ok, but why? Source Control
  • 58. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Source Control Uh oh. [The slide fills with boxes labeled “Lawyer”]
  • 59. It’s Not Just About Explainability! • Yes, models are complicated • But, that’s not enough: o What data did you train on? o How did you transform/exclude outliers? o What are the data statistics? o Did anything change between code and production? o What model did you actually serve (to this person)? • MLOps can help!
  • 60. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Source Control Automated Validation & Profiling Package For Rollout Explain Model & Look for Bias Clean/ Minimize Code Sane Deployment
  • 61. 32c04681d7573 What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Automated Validation & Profiling Package For Rollout Explain Model & Look for Bias Clean/ Minimize Code Sane Deployment Source Control Immutable Metadata Store b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759 9ce88802f0759
  • 62. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Automated Validation & Profiling Package For Rollout Explain Model & Look for Bias Clean/ Minimize Code Sane Deployment Source Control Immutable Metadata Store b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759 32c04681d7573 Why didn’t I get a loan? 9ce88802f0759
  • 63. What Did My Customers See? SRE/ML Engineers The Cloud Front End Model Server Customer Automated Validation & Profiling Package For Rollout Explain Model & Look for Bias Clean/ Minimize Code Sane Deployment Source Control Immutable Metadata Store b151f8e65b32a c7f4e7607b4b7 0ef1d58921d89 e2e1e994c4251 786c8e57a6d51 9ce88802f0759 32c04681d7573 32c04681d7573 9ce88802f0759
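The immutable metadata store on these slides can be sketched as a content-addressed, append-only log in which every pipeline step records a link to the step before it. This is a minimal illustration, not the speaker's implementation: the `MetadataStore` class, the record fields (`step`, `parent`, `rows`, `code_commit`, and so on), and the 13-character ID length are all hypothetical choices made for the example.

```python
import hashlib
import json
from types import MappingProxyType


def record_id(record: dict) -> str:
    """Content-address a record: identical content always yields the same ID."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    # Truncated to 13 hex characters purely for readability (an assumption).
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:13]


class MetadataStore:
    """Hypothetical append-only store: records are added and read, never edited."""

    def __init__(self) -> None:
        self._records = {}

    def put(self, record: dict) -> str:
        rid = record_id(record)
        # Re-adding identical content is a no-op; an ID never points at new content.
        self._records.setdefault(rid, MappingProxyType(dict(record)))
        return rid

    def get(self, rid: str):
        return self._records[rid]

    def lineage(self, rid: str) -> list:
        """Walk parent links from a served prediction back to the training data."""
        chain = []
        while rid is not None:
            rec = self.get(rid)
            chain.append(rec["step"])
            rid = rec.get("parent")
        return chain


store = MetadataStore()
data_id = store.put({"step": "training_data", "parent": None, "rows": 1000})
prep_id = store.put({"step": "transform", "parent": data_id, "outliers_dropped": 12})
model_id = store.put({"step": "model", "parent": prep_id, "code_commit": "abc123"})
served_id = store.put({"step": "prediction", "parent": model_id, "decision": "declined"})

print(store.lineage(served_id))
# ['prediction', 'model', 'transform', 'training_data']
```

Given the ID attached to one customer's answer, `lineage` recovers which model, which transformation, and which training data produced it, which is the question the "Why didn't I get a loan?" slide is asking.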
  • 64. What Does All This Stuff Solve For? 1. Does My Model Actually Work? 2. What Did My Customers See? 3. Is My Model Still Good?
  • 67. Is My Model Still Good? SRE/ML Engineers The Cloud There is a blue or orange DUCK inside this barn. What color is the duck?
  • 69. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server f7c5f9fe7b762 It’s a duck! BLUE There is a blue or orange DUCK inside this barn. What color is the duck?
  • 71. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server f7c5f9fe7b762 It’s a duck! BLUE 5 Blue Ducks 995 Yellow Ducks Accuracy = 99% False Positive = 1% ???????????????????
  • 73. Bayes’ Theorem: $P(A \mid B) = \dfrac{P(B \mid A) \cdot P(A)}{P(B)}$
  • 74. Accuracy depends on the population distribution!
  • 75. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server f7c5f9fe7b762 It’s a duck! BLUE 995 Yellow Ducks 5 Blue Ducks WRONG 2/3 of the Time! Accuracy = 99% False Positive = 1% ???????????????????
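The "wrong 2/3 of the time" figure follows from Bayes' theorem applied to the duck counts on the slide. The sketch below works through the arithmetic, assuming that "Accuracy = 99%" means the model labels 99% of blue ducks and 99% of yellow ducks correctly (so the false positive rate is 1%); the function name `posterior_blue` is made up for this example.

```python
def posterior_blue(n_blue: int, n_yellow: int, accuracy: float) -> float:
    """P(duck is blue | model says BLUE), via Bayes' theorem."""
    prior_blue = n_blue / (n_blue + n_yellow)
    prior_yellow = 1 - prior_blue
    true_positive = accuracy * prior_blue            # blue duck, model says BLUE
    false_positive = (1 - accuracy) * prior_yellow   # yellow duck, model says BLUE
    return true_positive / (true_positive + false_positive)


skewed = posterior_blue(n_blue=5, n_yellow=995, accuracy=0.99)
balanced = posterior_blue(n_blue=500, n_yellow=500, accuracy=0.99)

print(f"5 blue / 995 yellow: right {skewed:.3f} of the time")
print(f"500 blue / 500 yellow: right {balanced:.3f} of the time")
# 5 blue / 995 yellow: right 0.332 of the time
# 500 blue / 500 yellow: right 0.990 of the time
```

With only 5 blue ducks in 1,000, a BLUE answer is correct about one time in three, even though the model itself has not changed. With a 500/500 split, the same model is right 99% of the time.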
  • 78. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server f7c5f9fe7b762 It’s a duck! BLUE 995 Yellow Ducks 5 Blue Ducks Model Server d4093cc84b267
  • 80. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server 995 Yellow Ducks 5 Blue Ducks d4093cc84b267
  • 81. Is My Model Still Good? SRE/ML Engineers The Cloud Front End Model Server 500 Yellow Ducks 500 Blue Ducks d4093cc84b267
  • 82. Is My Model Still Good? • Models != Code – they can go stale... QUICKLY. • IMPORTANT: o Watch your model & data for drift from training o Regularly (if not continuously) retrain, even before performance begins to fail o Rollbacks across multiple versions are not uncommon! • Without an e2e MLOps pipeline, many of the above are O(really really hard)!
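One simple way to watch for drift from training is to compare the class mix seen in production against the mix the model was trained on. The sketch below uses total variation distance on the duck populations from the earlier slides. It assumes production labels (or a proxy such as the model's own predictions) are available, and `DRIFT_THRESHOLD` is a hypothetical value chosen for illustration, not a recommended setting.

```python
from collections import Counter


def class_shares(labels):
    """Fraction of observations belonging to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two class distributions (0 = same, 1 = disjoint)."""
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(label, 0.0) - q.get(label, 0.0)) for label in labels)


DRIFT_THRESHOLD = 0.1  # hypothetical cut-off; tune for your own data

training = ["blue"] * 5 + ["yellow"] * 995
live = ["blue"] * 500 + ["yellow"] * 500

distance = total_variation(class_shares(training), class_shares(live))
needs_retraining = distance > DRIFT_THRESHOLD

print(round(distance, 3), needs_retraining)
# 0.495 True
```

A check like this can run on a schedule inside the pipeline, so a population shift triggers retraining or a rollback before accuracy visibly degrades.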
  • 83. What Does All This Stuff Solve For? 1. Does My Model Actually Work? 2. What Did My Customers See? 3. Is My Model Still Good?
  • 85. MLOps Gives* You… • Software best practices for building machine learning solutions • Repeatable workflow for training a model and rolling it out to production • An immutable record of what’s actually running • Lineage of model creation including data sources • Acceleration from code to customer benefits * Requires some human and software work
  • 86. What’s Next for MLOps • Simplify monitoring and retraining • Extend MLOps to data, including prep and profiling • Enterprise features o Test cases o Auditing o Security o Resource management (bin packing / resource optimization) o Network isolation • Metadata and API standards Or, better yet, you tell us!
  • 87. It’s a whole new world • Data science will touch EVERY industry. • We can’t ask everyone to get a PhD in statistics, though. • How do WE help everyone take advantage of this transformation?
  • 88. me: David Aronchick twitter: @aronchick github: • https://github.com/aronchick/kubeflow-and-mlops • https://aka.ms/mlops THANK YOU!