Advanced MLflow
Matei Zaharia, Sid Murching, Tomas Nykodym
September 20th, 2018
What is MLflow?
Open source platform to accelerate the ML lifecycle
Tracking
Record and query experiments: code, data, config, results
Projects
Packaging format for reproducible runs on any platform
Models
General model format that supports diverse deployment tools
What’s New in MLflow?
Many updates since our last meetup at MLflow 0.2.1!
• Java API
• PyTorch, Keras, H2O, GCS, SFTP integrations
• APIs for tags & querying run history
• Optimized Spark ML serving with MLeap
• UX improvements
Ongoing Development
R API for MLflow (led by RStudio)
• See Javier’s talk later in this meetup!
CRUD UI and APIs (naming, annotating & deleting runs)
Improved experiment view and search UX
Learning More About MLflow
pip install mlflow to get started
Find docs & examples at mlflow.org
tinyurl.com/mlflow-slack
This Meetup
Go beyond the basics to show how to use MLflow’s components
for complex ML workflows
• Multi-step workflows with caching
• Hyperparameter tuning
Multistep ML Workflows
[Diagram labels: Project Spec (Code, Data, Config); Local Execution; Remote Execution]
MLflow Projects
● With projects: parametrized, dependency-agnostic runs of arbitrary code
● Can chain projects into multistep workflows
● Tracking server: source of truth for output of individual steps
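The chaining-with-caching idea above can be sketched in plain Python. This is a minimal illustration, not the code from the MLflow example: the in-memory RUNS list is a hypothetical stand-in for the tracking server, and run_step, load_raw_data, etl_data and workflow are names invented here. Each step is looked up by entry point and parameters, and its recorded output is reused when an identical run already exists.

```python
import json

# Hypothetical in-memory stand-in for a tracking server: one record per finished run.
RUNS = []


def run_step(entry_point, params, fn):
    """Run a step, or reuse the recorded output of an identical earlier run.

    Returns (output, cache_hit).
    """
    key = json.dumps(params, sort_keys=True)
    for run in RUNS:
        if run["entry_point"] == entry_point and run["params"] == key:
            return run["output"], True  # same step, same parameters: reuse it
    output = fn(**params)
    RUNS.append({"entry_point": entry_point, "params": key, "output": output})
    return output, False


def load_raw_data(n_rows):
    # Toy first step: pretend to download n_rows records.
    return list(range(n_rows))


def etl_data(rows, scale):
    # Toy second step: transform the output of the first step.
    return [r * scale for r in rows]


def workflow(n_rows, scale):
    """Chain two steps; the output of one step is a parameter of the next."""
    raw, raw_cached = run_step("load_raw_data", {"n_rows": n_rows}, load_raw_data)
    etl, etl_cached = run_step("etl_data", {"rows": raw, "scale": scale}, etl_data)
    return etl, (raw_cached, etl_cached)
```

Because every step is recorded independently, changing only the second step's parameter reruns that step alone, which is what makes it practical to debug and develop steps one at a time.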
Multistep Workflows
Can debug & develop steps independently
Demo
● Find this example at mlflow/examples/multistep_workflow
● MovieLens: given user and movie, predict a rating
Hyperparameter Optimization
ML algorithms have many parameters that affect performance:
● Learning rate, momentum, number and size of network layers, …
Parameter Selection Strategies:
● Manual
○ Depends on the data scientist's skills, high variance, error-prone
● Algorithmic
○ Reduced variance, less bias, lower chance of error
○ Strategies include grid search, random search, and model-based optimization
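To make the first two algorithmic strategies concrete, here is a small sketch of grid search and random search over learning rate and momentum. The validation_loss function is a toy objective made up for illustration (a real search would train a model and return its validation metric), and the function names and search ranges are assumptions.

```python
import itertools
import random


def validation_loss(learning_rate, momentum):
    # Toy objective standing in for a real training run; lowest at (0.1, 0.9).
    return (learning_rate - 0.1) ** 2 + (momentum - 0.9) ** 2


def grid_search(lr_values, momentum_values):
    """Evaluate every combination on a fixed grid and keep the best one."""
    candidates = itertools.product(lr_values, momentum_values)
    return min(candidates, key=lambda p: validation_loss(*p))


def random_search(n_trials, seed=0):
    """Sample parameters uniformly at random and keep the best sample."""
    rng = random.Random(seed)
    candidates = [
        (rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)) for _ in range(n_trials)
    ]
    return min(candidates, key=lambda p: validation_loss(*p))
```

Grid search costs one training run per grid cell, so its cost grows multiplicatively with each added parameter; random search fixes the budget up front through n_trials.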
Model-Based Hyperparameter Optimization
1. Select the “best parameters” based on the current model.
Params[n+1] = Select(Model[n])
2. Obtain new data points by training the model with the new parameters.
Metric[n+1] = Train(Params[n+1])
3. Use the new data points to update the model.
Model[n+1] = Update(Model[n], Metric[n+1])
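The Select / Train / Update loop above can be written out as a short sketch. Everything here is an assumption made for illustration: train is a toy one-parameter objective, and the "model" is a deliberately crude surrogate (predict the metric of the nearest observed point, minus a bonus for distance from it). Libraries such as Hyperopt or GPyOpt use far better surrogates; only the loop structure is the point.

```python
def train(lr):
    # Toy stand-in for a training run that returns a validation loss.
    return (lr - 0.3) ** 2


def select(model, candidates, kappa=1.0):
    """Params[n+1] = Select(Model[n]).

    Score each candidate by the loss of its nearest observation minus an
    exploration bonus proportional to the distance from that observation.
    """
    def score(x):
        nearest_x, nearest_y = min(model, key=lambda obs: abs(obs[0] - x))
        return nearest_y - kappa * abs(nearest_x - x)

    return min(candidates, key=score)


def update(model, params, metric):
    """Model[n+1] = Update(Model[n], Metric[n+1]): append the new observation."""
    return model + [(params, metric)]


def optimize(n_iter=10):
    candidates = [i / 20 for i in range(21)]        # learning rates in [0, 1]
    model = [(0.0, train(0.0)), (1.0, train(1.0))]  # two initial observations
    for _ in range(n_iter):
        params = select(model, candidates)          # step 1
        metric = train(params)                      # step 2
        model = update(model, params, metric)       # step 3
    best_params, best_metric = min(model, key=lambda obs: obs[1])
    return best_params, best_metric, model
```

Each iteration spends exactly one training run, and the choice of where to spend it depends on all earlier results, which is what separates this approach from grid or random search.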
Hyperparameter Tuning with MLflow
[Diagram labels: HyperParam Search Run; Train Model Run; mlflow.log_metric(), mlflow.get_metric(); Projects, Tracking, Models; mlflow run ...; Logged Model; mlflow.log_artifact]
MLflow HyperParameter Example
You can find this example at mlflow/examples/hyperparam.
Goal: predict wine quality from measured properties
• data: acidity, sugar content, chlorides, alcohol, ...
• target: quality score, integer between 3 and 9
• metric: “rmse”
The MLproject file has the following entry points:
• train - train a deep learning model with Keras; has two tunable parameters: learning rate and momentum
• hyperparameter search - train with random, hyperopt, or gpyopt
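For reference, the “rmse” metric named above is the root mean squared error between true and predicted quality scores. A minimal sketch follows; the function name and the input validation are choices made here, not taken from the example.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Lower is better, so a search over learning rate and momentum would minimize this value on held-out data.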
Thank you

Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating Custom Libraries
