Start with version control and experiments management in ML: reproducible experiments
Data Fest3, Minsk, 2019
Mikhail Rozhkov
Workflow of an ML project and its artifacts
[Diagram] Project stages: Problem Statement → MVP Design → Get Data → Prepare Data → Train Model → Evaluate Model → Test & Integrate → Serve / Predict → Monitor, grouped into four phases: 1. Analyze & Plan, 2. Prototype (solution development), 3. Productionize, 4. Monitor & Maintain.
Inspired by Uber’s workflow-of-a-machine-learning-project diagram: Scaling Machine Learning at Uber with Michelangelo, https://eng.uber.com/scaling-michelangelo/
Experiment: pipelines, configs and artifacts
[Diagram] The anatomy of an experiment: an Algorithm, Data, and Hyperparameters feed a pipeline in which ETL tasks produce the train and test datasets, a train step produces the Model, and an evaluate step produces the Evaluation Measure. An Experiment config ties it all together. Legend: artifacts, pipelines, code, configs.
ML reproducibility is a dimension of quality

What is reproducibility? Using the original methods applied to the original data to produce the original results [Gardner].

Why should you care?
● Trust
● Consistent results
● Versioned history
● Team performance
● Painless production

Josh Gardner, Yuming Yang, Ryan S. Baker, Christopher Brooks. Enabling End-To-End Machine Learning Replicability: A Case Study in Educational Data Mining.
Is there a “magic button”?
ML Reproducibility
1. Automated pipelines
2. Control run params
3. Control execution DAG
4. Code version control
5. Artifacts version control (models, datasets, etc.)
6. Use shared/cloud storage for artifacts
7. Environment dependencies control
How to start?
[Chart] Manual vs. automated work across steps 1-4: manual work falls from 100% to about 10% while automated work grows from 0% to about 90%, freeing up time for the actual data-science task.
Start with artifacts versioning!
[Diagram] The same experiment diagram as above: ETL tasks, train and test datasets, train and evaluate steps, the Model and the Evaluation Measure, all driven by the Experiment config. The artifacts in it (datasets, model, configs) are what to version first.
Use case: a dogs-and-cats classifier
● Project
○ Classify dogs and cats by photo
○ Data
■ Objects: cats, dogs
■ Dogs: 12,500 images
■ Cats: 12,500 images
○ Metrics: accuracy, ROC-AUC
● Team
○ > 2 members
○ Different machines/servers
○ Different OSes
○ git-flow dev process
○ Runs on one machine
Step 1: Jupyter Notebook
● Code in a Jupyter Notebook
● Everything in Docker (a sketch follows this list)
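As a rough illustration of the “everything in Docker” setup, the environment is baked into an image and the notebook runs inside a container. A minimal sketch; the image name, tag, and mount paths are hypothetical:

```bash
# Build an image with pinned dependencies (the Dockerfile lists Python, Jupyter, ML libs)
docker build -t catsdogs-env:0.1 .

# Run Jupyter inside the container, mounting the project directory
docker run --rm -p 8888:8888 -v "$PWD":/workspace -w /workspace \
  catsdogs-env:0.1 jupyter notebook --ip=0.0.0.0 --allow-root
```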
ML reproducibility checklist
1. Automated pipelines
2. Control run params
3. Control execution DAG
4. Code version control
5. Artifacts version control (models, datasets, etc.)
6. Use shared/cloud storage for artifacts
7. Environment dependencies control
8. Experiments results tracking
Step 2: build pipelines
● Move common code into .py modules
● Build pipelines (a sketch follows this list)
● Everything in Docker
● Run experiments in the terminal or a Jupyter Notebook
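With the shared code in modules, each pipeline stage becomes a plain script invoked from the terminal, so a whole experiment is just a sequence of commands. A minimal sketch; the stage script and config names are hypothetical:

```bash
# Run the pipeline stage by stage, all driven by one config file
python src/split.py    --config=config/pipeline_config.yml
python src/train.py    --config=config/pipeline_config.yml
python src/evaluate.py --config=config/pipeline_config.yml
```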
Set up pipelines
[Diagram] Three stages, each reading data and a config: split produces an index; train consumes the index and produces the Model and a train report; evaluate consumes the index and the Model and produces a test report.
ML reproducibility checklist
1. Automated pipelines
2. Control run params
3. Control execution DAG
4. Code version control
5. Artifacts version control (models, datasets, etc.)
6. Use shared/cloud storage for artifacts
7. Environment dependencies control
8. Experiments results tracking
Step 3: add version control for artifacts
● Put models/data/configs under DVC control (a minimal sketch follows this list)
● Same code in .py modules
● Same pipelines
● Everything in Docker
● Run experiments in the terminal or a Jupyter Notebook
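DVC keeps large artifacts out of Git: `dvc add` writes a small .dvc metafile (which Git tracks) and caches the data, and a remote covers the shared/cloud-storage item on the checklist. A minimal sketch; the remote name and bucket URL are hypothetical:

```bash
dvc init                                       # set up DVC inside the Git repo
dvc add data/train data/test models/model.h5   # track artifacts via .dvc metafiles
git add data/train.dvc data/test.dvc models/model.h5.dvc .gitignore
git commit -m "Track datasets and model with DVC"

dvc remote add -d storage s3://my-bucket/dvc-storage   # shared/cloud storage
dvc push                                       # upload the artifacts to the remote
```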
ML reproducibility checklist
1. Automated pipelines
2. Control run params
3. Control execution DAG
4. Code version control
5. Artifacts version control (models, datasets, etc.)
6. Use shared/cloud storage for artifacts
7. Environment dependencies control
8. Experiments results tracking
Step 4: add execution-DAG control
● Put pipeline dependencies under DVC control (a minimal sketch follows this list)
● Models/data/configs under DVC control
● Same code in .py modules
● Same pipelines
● Everything in Docker
● Run experiments in the terminal or a Jupyter Notebook
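In 2019-era DVC, a stage is declared with `dvc run`, which records its dependencies (-d) and outputs (-o) so that `dvc repro` re-executes only the stages whose inputs changed (newer DVC versions declare stages in dvc.yaml instead). File names here are illustrative:

```bash
dvc run -f train.dvc \
  -d src/train.py -d data/train -d config/train_config.yml \
  -o models/model.h5 \
  python src/train.py --config=config/train_config.yml

dvc repro train.dvc   # re-run the DAG; unchanged stages are skipped
```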
Set up pipelines
[Diagram] The same three-stage pipeline (split → train → evaluate, each stage reading data and a config and passing an index along), now with the experiment config split into per-stage files: prepare config, split config, train config, and eval config. A hypothetical layout is sketched below.
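Splitting the experiment config into per-stage files keeps every stage's inputs explicit and diffable. A hypothetical layout and config, viewed from the terminal (file names and values are illustrative):

```bash
ls config/
# prepare_config.yml  split_config.yml  train_config.yml  eval_config.yml

cat config/train_config.yml
# batch_size: 64
# epochs: 10
# model: models/model.h5
```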
ML reproducibility checklist
1. Automated pipelines
2. Control run params
3. Control execution DAG
4. Code version control
5. Artifacts version control (models, datasets, etc.)
6. Use shared/cloud storage for artifacts
7. Environment dependencies control
8. Experiments results tracking
Step 5: add experiments control
● Add experiment benchmarking (DVC, mlflow; a sketch follows this list)
● Pipeline dependencies under DVC control
● Models/data/configs under DVC control
● Same code in .py modules
● Same pipelines
● Everything in Docker
● Run experiments in the terminal or a Jupyter Notebook
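On the DVC side, a metrics file registered with -M can be compared across runs and branches. A minimal sketch using 2019-era DVC commands; file names are illustrative:

```bash
dvc run -f evaluate.dvc \
  -d src/evaluate.py -d models/model.h5 \
  -M reports/metrics.json \
  python src/evaluate.py --config=config/eval_config.yml

dvc metrics show -a   # compare the metrics file across all branches
```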
Metrics tracking in the mlflow UI

from mlflow import log_metric, log_param, log_artifact

log_artifact(args.config)                      # attach the run's config file as an artifact
log_param('batch_size', config['batch_size'])  # record a hyperparameter
log_metric('f1', f1)                           # record evaluation metrics
log_metric('roc_auc', roc_auc)
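By default these calls log to a local mlruns/ directory; the UI shown on the next slide is then served with a standard mlflow command:

```bash
mlflow ui --port 5000   # browse runs, params, and metrics at http://localhost:5000
```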
Experiments benchmarking
[Screenshot: the mlflow UI showing a table of runs with their params and metrics]
ML reproducibility checklist
1. Automated pipelines
2. Control run params
3. Control execution DAG
4. Code version control
5. Artifacts version control (models, datasets, etc.)
6. Use shared/cloud storage for artifacts
7. Environment dependencies control
8. Experiments results tracking
Conclusions
1. Pipelines are not difficult to build.
2. Start wherever you spot a copy-paste pattern.
3. Artifacts version control is a must.
4. Discipline in the team matters.
5. The benefits grow with project complexity and team size.
Contact me
Mikhail Rozhkov
mail: mnrozhkov@gmail.com
ods: @Mikhail Rozhkov