Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating Custom Libraries
Matei Zaharia, Sid Murching, Tomas Nykodym
September 20th, 2018
What is MLflow?
Open source platform to accelerate the ML lifecycle
• Tracking: record and query experiments (code, data, config, results)
• Projects: packaging format for reproducible runs on any platform
• Models: general model format that supports diverse deployment tools
What’s New in MLflow?
Many updates since our last meetup at MLflow 0.2.1!
• Java API
• PyTorch, Keras, H2O, GCS, SFTP integrations
• APIs for tags & querying run history
• Optimized Spark ML serving with MLeap
• UX improvements
Ongoing Development
R API for MLflow (led by RStudio)
• See Javier’s talk later in this meetup!
CRUD UI and APIs (naming, annotating & deleting runs)
Improved experiment view and search UX
Learning More About MLflow
pip install mlflow to get started
Find docs & examples at mlflow.org
tinyurl.com/mlflow-slack
This Meetup
Go beyond the basics to show how to use MLflow’s components
for complex ML workflows
• Multi-step workflows with caching
• Hyperparameter tuning
Multistep ML Workflows
[Diagram: a Project Spec (code, data, config) can be run through local execution or remote execution.]
MLflow Projects
● With projects: parametrized, dependency-agnostic runs of arbitrary code
● Can chain projects into multistep workflows
● Tracking server: source of truth for output of individual steps
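The caching idea behind chained steps can be sketched as follows. This is a minimal illustration, not the code from the MLflow example: `RunRecord`, `find_cached_run`, `get_or_run`, the in-memory `tracking_store`, and the entry point names `load_data` and `train` are all hypothetical stand-ins. In a real workflow the run records would come from the tracking server and `launch` would start an MLflow project run.

```python
from dataclasses import dataclass, field


@dataclass
class RunRecord:
    """Hypothetical stand-in for a run stored on the tracking server."""
    run_id: str
    entry_point: str
    params: dict
    status: str = "FINISHED"
    artifacts: dict = field(default_factory=dict)


def find_cached_run(runs, entry_point, params):
    """Return a finished run with the same entry point and parameters, if any."""
    for run in runs:
        if (run.entry_point, run.params, run.status) == (entry_point, params, "FINISHED"):
            return run
    return None


def get_or_run(runs, entry_point, params, launch):
    """Reuse a cached step when possible; otherwise launch it and record the result."""
    cached = find_cached_run(runs, entry_point, params)
    if cached is not None:
        return cached
    run = launch(entry_point, params)
    runs.append(run)
    return run


launched = []  # records which steps were actually executed


def fake_launch(entry_point, params):
    """Pretend to execute a step; a real launcher would start a project run."""
    launched.append(entry_point)
    return RunRecord(run_id=f"run-{len(launched)}", entry_point=entry_point, params=params)


tracking_store = []  # in-memory substitute for the tracking server
load = get_or_run(tracking_store, "load_data", {}, fake_launch)
train_run = get_or_run(tracking_store, "train", {"source_run": load.run_id}, fake_launch)

# Running the workflow a second time reuses both steps instead of launching them again.
get_or_run(tracking_store, "load_data", {}, fake_launch)
get_or_run(tracking_store, "train", {"source_run": load.run_id}, fake_launch)
```

Because each step is looked up by entry point and parameters, a step can be rerun or debugged on its own without recomputing the steps before it.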
Multistep Workflows
Can debug & develop steps independently
Demo
● Find this example at mlflow/examples/multistep_workflow
● MovieLens: given user and movie, predict a rating
Hyperparameter Optimization
ML algorithms have many parameters that affect performance:
● Learning rate, momentum, number and size of network layers, …
Parameter selection strategies:
● Manual
○ Depends on the data scientist's skills; high variance, error-prone
● Algorithmic
○ Reduced variance, less bias, lower chance of error
○ Strategies include grid search, random search, and model-based optimization
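The two simplest algorithmic strategies can be sketched in a few lines. This is an illustrative sketch only: `train` is a made-up objective standing in for a real training run, and the search ranges are assumptions chosen for the example.

```python
import itertools
import random


def train(learning_rate, momentum):
    """Hypothetical stand-in for a training run; returns a loss to minimize."""
    return (learning_rate - 0.01) ** 2 + (momentum - 0.9) ** 2


def grid_search(learning_rates, momenta):
    """Evaluate every combination on a fixed grid and keep the best one."""
    grid = itertools.product(learning_rates, momenta)
    return min(grid, key=lambda p: train(*p))


def random_search(n_trials, seed=0):
    """Sample parameters at random (learning rate on a log scale) and keep the best."""
    rng = random.Random(seed)
    trials = [(10 ** rng.uniform(-4, -1), rng.uniform(0.0, 1.0)) for _ in range(n_trials)]
    return min(trials, key=lambda p: train(*p))


best_grid = grid_search([0.001, 0.01, 0.1], [0.5, 0.9, 0.99])
best_random = random_search(n_trials=20)
```

Grid search costs one run per grid cell, so its cost grows multiplicatively with each added parameter, while random search lets the number of trials be fixed in advance.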
Model-Based Hyperparameter Optimization
1. Select “best parameters” based on current model.
Params[n+1] = Select(Model[n])
2. Obtain new data points by training the model with new parameters
Metric[n+1] = Train(Params[n+1])
3. Use new data points to update the model.
Model[n+1] = Update(Model[n], Metric[n+1])
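The Select / Train / Update loop above can be sketched with a deliberately simple surrogate. This is a toy illustration under stated assumptions, not what libraries such as hyperopt or GPyOpt do internally: the "model" here is just the list of observed points, scored with a nearest-neighbour prediction plus an exploration bonus, and `train` is a made-up one-parameter objective.

```python
import random


def train(params):
    """Stand-in for an expensive training run; returns a metric to minimize."""
    return (params - 0.3) ** 2


def predict(model, params, explore=0.5):
    """Toy surrogate: metric of the nearest observed point, minus a bonus for
    being far from anything already tried."""
    nearest_params, nearest_metric = min(model, key=lambda obs: abs(obs[0] - params))
    return nearest_metric - explore * abs(nearest_params - params)


def select(model, rng, n_candidates=50):
    """Step 1: Params[n+1] = Select(Model[n])."""
    candidates = [rng.uniform(0.0, 1.0) for _ in range(n_candidates)]
    return min(candidates, key=lambda p: predict(model, p))


def update(model, params, metric):
    """Step 3: Model[n+1] = Update(Model[n], Metric[n+1])."""
    return model + [(params, metric)]


def optimize(n_iter=20, seed=0):
    rng = random.Random(seed)
    first = rng.uniform(0.0, 1.0)
    model = [(first, train(first))]  # seed the model with one observation
    for _ in range(n_iter):
        params = select(model, rng)
        metric = train(params)  # Step 2: Metric[n+1] = Train(Params[n+1])
        model = update(model, params, metric)
    return min(model, key=lambda obs: obs[1])


best_params, best_metric = optimize()
```

The `explore` weight trades off revisiting regions that already look good against probing regions with no observations; a real surrogate would replace `predict` with a proper statistical model.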
Hyperparameter Tuning with MLflow
[Diagram: a HyperParam Search Run launches Train Model Runs with mlflow run ... (Projects); metrics pass between the runs via mlflow.log_metric() and mlflow.get_metric() (Tracking); the Logged Model is stored with mlflow.log_artifact (Models).]
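A search run that launches training runs and reads back their metrics might look roughly like the sketch below. It is not the code from the MLflow example. It assumes a recent MLflow release in which `mlflow.projects.run` returns a submitted run with a `run_id` and `MlflowClient().get_run(...).data.metrics` is a dictionary; the project URI `"."`, the parameter names `learning_rate` and `momentum`, and the helper names are assumptions made for illustration.

```python
def run_trial_with_mlflow(params, project_uri="."):
    """Launch the project's train entry point and return the rmse it logged.

    Assumes the train step logs a metric named "rmse"; the project URI and
    parameter names are placeholders.
    """
    # Imported lazily so the search logic below can be exercised without MLflow.
    import mlflow.projects
    from mlflow.tracking import MlflowClient

    submitted = mlflow.projects.run(
        uri=project_uri,
        entry_point="train",
        parameters=params,
        synchronous=True,  # block until the training run finishes
    )
    return MlflowClient().get_run(submitted.run_id).data.metrics["rmse"]


def hyperparam_search(candidates, run_trial=run_trial_with_mlflow):
    """Run one training trial per candidate and return the best (params, rmse)."""
    results = [(run_trial(params), params) for params in candidates]
    best_rmse, best_params = min(results, key=lambda r: r[0])
    return best_params, best_rmse
```

Passing the trial launcher as an argument keeps the search strategy separate from how a run is executed, so the same loop can be tried with a stub function before launching real training runs.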
MLflow Hyperparameter Example
You can find this example at mlflow/examples/hyperparam.
Goal: predict wine quality from measured properties
• data: acidity, sugar content, chlorides, alcohol, ...
• target: quality score, integer between 3 and 9
• metric: “rmse”
The MLproject has the following entry points:
• train - trains a deep learning model with Keras; has two tunable parameters: learning rate and momentum
• hyperparam - tunes the train entry point with random, hyperopt, or gpyopt
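For reference, the rmse metric is the square root of the mean squared difference between predicted and true quality scores. The sketch below computes it on made-up numbers; the sample scores and predictions are illustrative and not taken from the wine dataset.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up quality scores (integers between 3 and 9) and model predictions.
quality = [5, 6, 7, 3, 9]
predicted = [5.5, 6.0, 6.0, 4.0, 8.5]
score = rmse(quality, predicted)  # squared errors sum to 2.5, so score = sqrt(0.5)
```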
Thank you
