Experimentation to
Industrialization:
Implementing MLOps
Deb Lee & Al McEwan
Thorogood Associates
About Us
Deb Lee
SENIOR CONSULTANT
MLOps Practice Lead
deb.lee@thorogood.com
Al McEwan
PRINCIPAL CONSULTANT
Solutions Architect, Databricks Champion, Global
Head of Capability Development
al.mcewan@thorogood.com
Independent, Specialist Data & AI Consultancy
US • UK • Singapore • Brazil • India
Databricks Partner Since 2018
www.thorogood.com
Data Science • MLOps & DevOps • Data Engineering • Data Visualization
Agenda
• Context
• Case Study
• Key Learnings & Takeaways
  • People
  • Processes
  • Tools
Companies are investing heavily in ML and AI
Thinking must shift to embrace operationalization
• Sandbox environments
• Ad Hoc, Exploratory
• Low Commitment
• Familiar tool for the data
scientist
• Done locally or in non-
integrated environments
EXPERIMENTAL
• Automated
• Integrated
• Reusable
• Scalable
• Understood and trusted
• Cost efficient
• Ongoing experiments
OPERATIONAL
MLOps
Key Benefits of MLOps
SCALABILITY
Ability to scale horizontally and vertically,
with consumption efficiencies from running data
engineering and data science at scale
MODEL EVALUATION
Maintain and monitor model quality using standardized &
consolidated custom KPIs and model evaluation metrics
FAST FEEDBACK LOOP
Respond quickly to business opportunities and
changes, and incorporate enhancements to the
product on a regular basis
REUSABLE ASSETS
Track, monitor, and identify reusable assets
(registered models, datasets, pipelines) to
increase efficiency & cost savings
MODEL TRACEABILITY
Create traceability & wider auditability using enterprise
model registries, experiment tracking, and monitoring
operations for greater observability
AUTOMATED MODEL TRAINING
Decrease manual dependencies using pipelines
configured to kick off automated retraining based on
defined triggers
REPRODUCIBILITY
Save time & create governance for product teams
by using tools that enable reproducibility of
experiments and model training
VERSION SECURITY & COMPATIBILITY
Maintain security by using licensed packages on
tested versions, keep OS versions of clusters up to
date, keep all libraries and packages up to date
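The "defined triggers" behind automated retraining can be as simple as a few thresholds checked on a schedule. The following is a minimal Python sketch, assuming three hypothetical triggers (model age, a drop in an evaluation metric, and a drift score). The names and threshold values are illustrative assumptions and are not taken from the talk.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RetrainPolicy:
    """Hypothetical thresholds; real values would be agreed per product."""
    max_model_age_days: int = 30
    min_accuracy: float = 0.85
    max_drift_score: float = 0.2


def retraining_reasons(
    model_age_days: int,
    accuracy: float,
    drift_score: float,
    policy: Optional[RetrainPolicy] = None,
) -> List[str]:
    """Return the triggers that fired; an empty list means no retraining is needed."""
    policy = policy or RetrainPolicy()
    reasons = []
    if model_age_days > policy.max_model_age_days:
        reasons.append("model_age")
    if accuracy < policy.min_accuracy:
        reasons.append("accuracy_drop")
    if drift_score > policy.max_drift_score:
        reasons.append("data_drift")
    return reasons
```

A scheduled pipeline step could call `retraining_reasons(...)` and start the training pipeline only when the returned list is non-empty, logging the reasons for traceability.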
Case Study
Establishing a Global MLOps Framework
Customer situation
To stay ahead, the customer recognized that a globally coordinated
strategy and framework were needed to realize the benefits of MLOps
Investment in experimentation that has proven
valuable
Data science teams work in focused business
areas, following independent practices
Fortune Global 500
Consumer Goods
Company
• 190 countries
• 2.5 billion+ consumers
daily
• 400 brands
Thorogood’s approach
Experimentation
ML models
operationalized
MLOps guidance,
recommendations &
artefacts, project-tested
Creation of reusable
Code & Pipeline
Accelerator templates
Framework impact
REUSABILITY
As more products are onboarded, a central
function will improve reusability of existing assets
and help consolidate models and approaches
used across products.
TIME & COST SAVINGS
Reduce duplicative effort & apply responsible
cloud consumption principles to all projects,
receive cost efficiencies from consolidation of
operations.
SIMPLIFICATION
A centralized function will maintain
adherence to MLOps suggested standards to
simplify toolsets used and improve ways of
working for all teams.
CONTINUOUS IMPROVEMENT
The MLOps service will have dedicated teams
for ongoing operations and one-off activities
such as product enhancements &
industrialization efforts.
SCALABILITY
Enable data science projects to scale up
more quickly, rapidly realize a vision to
unlock business value using data science
in all areas of the organization.
RELIABILITY
Build greater trust and confidence from business
users and data science teams by allowing them
to realize the value of MLOps delivered using a
consistent and high-quality methodology
Customer’s
Global MLOps
Service
Key Takeaways & Learnings
People Processes Tools
Data
People
Real-world ML Systems
Reference: “Hidden Technical Debt in Machine Learning Systems” by D. Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, et al., 2015
Configuration
Data Collection
Feature Extraction
ML Code
Data Verification
Machine Resource Management
Analysis Tools
Process Management Tools
Serving Infrastructure
Monitoring
People
MLOps Requires Data Scientists who understand both Scale and Reproducibility
ML Code: could be relatively small, but key to success
Data Scientist skillset specialization
Training in making code scalable, efficient and reproducible
People
Blend of capabilities and skills needed depends on the engagement
Machine Learning Engineering
Data Science
Data Engineering
Data Visualization
Solution Architect
Program Management
Scenario 1
Operationalization of a
use case requiring:
• Real-time model
serving capabilities
• Web application
interface and backend
• Creation of data
engineering and data
science pipelines
• Scripted management
and versioning of
compute, datastore,
datasets, pipelines
Scenario 2
Continuous improvements
to baseline monitoring
operations requiring:
• Create automatically
refreshed monitoring
dashboards
• Enhance tracking of
and reporting on drift
and other scoring
metrics alongside
experiment tracking
• Design for various
target audiences: data
scientists, ML support
engineers, business
users
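One common way to put a number on the "drift" tracked in Scenario 2 is the Population Stability Index (PSI). The talk does not name a specific drift metric, so PSI is swapped in here purely as an illustration. The sketch below assumes a single numeric feature, equal-width bins derived from the baseline sample, and a small floor for empty bins.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], bins: int = 10
) -> float:
    """PSI between a baseline sample and a recent sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the baseline range are clamped into the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor keeps the logarithm defined when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A frequently quoted rule of thumb treats a PSI above roughly 0.2 as a shift worth investigating, but the threshold should be validated for each model before it drives a dashboard alert.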
Processes
Key Takeaways & Learnings – Artefacts Created
There are a number of moving parts and handshakes needed for a centralized MLOps service to function and teams to be in sync.
Without a defined framework and process, it’s hard to be successful.
QUESTIONNAIRE
Used to qualify use
cases & projects in the
pipeline for
onboarding to MLOps
service
ML TEST SCORE
Measures the overall
readiness of the ML
system for production
DECISION TREE
For anyone embarking on a
data science project, guide on
tools to use considering
training volumes, libraries,
serving method,
parallelization, retraining
frequency
PLAYBOOK
Guidelines for
experimentation and
operationalization to
streamline the MLOps
process
REPRODUCIBILITY
CHECKLIST
Requires code versioning,
data versioning, model
versioning in model
registry, cluster
configuration, environment
specification
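The reproducibility checklist lends itself to an automated gate. The sketch below is a hedged illustration in Python: the five fields mirror the items listed on the slide, while the class name, field names, and example values are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ReproducibilityRecord:
    """One field per checklist item; names are illustrative assumptions."""
    code_version: Optional[str] = None      # e.g. a git commit hash
    data_version: Optional[str] = None      # e.g. a dataset snapshot or table version
    model_version: Optional[str] = None     # version assigned by the model registry
    cluster_config: Optional[str] = None    # e.g. path to a cluster definition file
    environment_spec: Optional[str] = None  # e.g. path to a pinned requirements file


def missing_items(record: ReproducibilityRecord) -> List[str]:
    """Return the checklist items that have not been recorded yet."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]
```

A release pipeline could refuse to promote a model while `missing_items(record)` is non-empty.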
Reference: “The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction” by E. Breck et al., 2017
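For readers unfamiliar with the rubric, the scoring in the Breck et al. paper works as follows: each test earns half a point if it is run manually with documented results and a full point if it is automated. Points are summed within each of four sections (data, model, infrastructure, monitoring), and the final score is the minimum of the four section totals. The Python sketch below follows that scheme. The input format is an assumption made for illustration.

```python
from typing import Dict, List, Tuple

SECTIONS = ("data", "model", "infrastructure", "monitoring")
POINTS = {"none": 0.0, "manual": 0.5, "automated": 1.0}


def ml_test_score(results: Dict[str, List[str]]) -> Tuple[float, Dict[str, float]]:
    """results maps a section name to one status per test: 'none', 'manual' or 'automated'.

    Returns (final_score, per_section_scores). A missing section scores 0.
    """
    section_scores = {
        section: sum(POINTS[status] for status in results.get(section, []))
        for section in SECTIONS
    }
    # The weakest section determines overall production readiness.
    return min(section_scores.values()), section_scores
```

Taking the minimum rather than the average means a team cannot offset missing monitoring with extra data tests.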
Tools
Decision trees helping to guide tool selection at critical junctures
How many models are being
built?
A large model spanning the
entire business
One model per dimension (e.g.
per product)
We recommend use of Spark’s
MLlib if the model is trained on a
big dataset (>0.5GB)
We recommend use of
Spark’s MLlib if
cross-validation scenarios exist
Non-Spark options can be
considered for smaller training
datasets
Non-Spark options can be
considered for this scenario
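The decision tree above can be sketched as a small function. The slide text is flattened, so the pairing of recommendations with branches is an assumption: the dataset-size rule is attached to the single large model and the cross-validation rule to the one-model-per-dimension case. The 0.5 GB threshold comes from the slide, while the function and argument names are hypothetical.

```python
def recommend_training_tool(
    model_scope: str,
    training_data_gb: float = 0.0,
    uses_cross_validation: bool = False,
) -> str:
    """model_scope is 'single_large' or 'per_dimension' (illustrative labels)."""
    if model_scope == "single_large":
        # Assumed branch: dataset size drives the choice for one large model.
        if training_data_gb > 0.5:
            return "Spark MLlib"
        return "Non-Spark option"
    if model_scope == "per_dimension":
        # Assumed branch: cross-validation drives the choice for many small models.
        if uses_cross_validation:
            return "Spark MLlib"
        return "Non-Spark option"
    raise ValueError(f"unknown model_scope: {model_scope!r}")
```

Encoding the tree as code keeps the guidance versioned alongside the accelerator templates and makes each recommendation easy to test.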
Training & Evaluation
Orchestration
Deployment
Tracking
Experimentation Initial Industrialization
Model Monitoring &
Enhancements
Considerations:
Decision Points:
…
…
…
…
…
…
…
…
Tools
Databricks is Optimally Positioned to Support MLOps
Databricks Spark: Optimized for
large training data volumes per
model
Best-in-class and widely used for
data science experiments
Multi-Cloud ready:
available on Azure, AWS, and GCP
Unifies requisite data engineering &
data science capabilities with in-built
functions
MLflow provides a powerful platform
to manage the ML lifecycle
Integrated with serving and reporting
technologies
How to get started
• Assess your current state
• Define your target state
• Refine your approach to People, Tools and Processes
• Educate yourself on the ‘art of the possible’
  • Check out our MLOps Resource Hub for useful content at www.thorogood.com
  • Most importantly, please reach out to us with any questions or feedback on this topic
CONTACT US
Deb Lee
deb.lee@thorogood.com
Al McEwan
al.mcewan@thorogood.com
Feedback
Your feedback is important to us.
Don’t forget to rate and review the sessions.
