From Training to Explainability via GitOps
Kubeflow Contributor Summit
October 2019
Outline
- Background: What Customers want from Kubeflow
  - Time to value
  - Governance
- How best to get to live predictions?
- GitOps - why and how
- Pipeline to serving walkthrough with
  - Oversight
  - Observability
  - Explainability
What Customers want from an ML Platform
Empowerment/Time to value
● Self-service for data science
● DS & Ops collaboration
● Sandboxing
● Repeatable approaches
Governance
● Visibility and oversight of running models
● Detailed monitoring
● Audit trails
● Access control
● Repeatable approaches
● Explainability
Kubeflow ticks these boxes!
Kubeflow for Collaboration
- Jupyter
  - Collaboration
  - Sandboxing (inc fairing)
  - Share repeatable approaches
- Pipelines
  - Data science and Ops collaboration
  - Repeatable approaches
  - Audit Trails (governance)
Kubeflow for Governance
- Metadata/Artifact Management
  - Track what was produced, when and how
- Multi User Isolation
  - Control who can do what
Path to Live Serving
Those features are aimed at exploration and training.
Multiple paths to serving (live predictions) with Kubeflow.
How best to get from training to serving?
How do we get to serving with empowerment and governance?
GitOps for Live Serving
● Cluster state represented declaratively
● ArgoCD/Flux/Jenkins-X
● Audit trails and reverts
● Git permissions
● Favourite with Ops
Ok to push to cluster for sandboxing.
GitOps great option for prod… but how best to do it?
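The GitOps idea on this slide can be sketched as a reconcile step: the desired state lives in Git, and a controller works out what has to change in the cluster. This is a toy illustration only, not how ArgoCD, Flux or Jenkins-X are implemented; the `reconcile` function, the dict-based manifests and the bucket paths are all hypothetical.

```python
# Toy GitOps reconcile step: compare the desired state (from Git) with
# the live state (from the cluster) and list the actions needed.
# Plain dicts stand in for manifests; every name here is hypothetical.

def reconcile(desired, live):
    """Return sorted (action, name) pairs that would make live match desired."""
    actions = []
    for name, manifest in desired.items():
        if name not in live:
            actions.append(("create", name))
        elif live[name] != manifest:
            actions.append(("update", name))
    for name in live:
        if name not in desired:
            actions.append(("delete", name))
    return sorted(actions)


desired = {
    "income-classifier": {"replicas": 1, "modelUri": "gs://bucket/model-v2"},
    "iris": {"replicas": 1, "modelUri": "gs://bucket/iris"},
}
live = {
    "income-classifier": {"replicas": 1, "modelUri": "gs://bucket/model-v1"},
    "old-model": {"replicas": 1, "modelUri": "gs://bucket/old"},
}

plan = reconcile(desired, live)
print(plan)
```

A revert is then just a Git revert of the desired state: the next reconcile produces the actions that take the cluster back.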
From Experimentation To Explainability
Example with GitOps
The scenario
● Classify income (as high or low) based on US Census features incl. age, gender, race, marital status
● Train a scikit-learn classifier
● Deploy from Kubeflow pipeline via GitOps
● Serve requests with Seldon
● Deploy alibi explainer and explain predictions
Build Model
- Model is income classifier
- Build alibi explainer together with model
# Imports needed by this snippet; preprocessor, X_train, Y_train, feature_names
# and category_map are prepared earlier (not shown on the slide)
import alibi
import dill
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline

# Train a random forest (RF) model behind the preprocessing step
np.random.seed(0)
clf = RandomForestClassifier(n_estimators=50)
pipeline = Pipeline([('preprocessor', preprocessor),
                     ('clf', clf)])
pipeline.fit(X_train, Y_train)
print(X_train.shape)
print(pipeline.predict(X_train[0:1]))

print("Creating an explainer")
predict_fn = lambda x: pipeline.predict_proba(x)
# Sanity-check the predict function on a real row and on an all-zeros row
predict_fn(X_train[0:1])
predict_fn(np.zeros([1, len(feature_names)]))
explainer = alibi.explainers.AnchorTabular(predict_fn=predict_fn,
                                           feature_names=feature_names,
                                           categorical_names=category_map)
explainer.fit(X_train)
# Clear predict_fn before saving: it is a lambda and will be reset when loaded
explainer.predict_fn = None
with open("explainer.dill", 'wb') as f:
    dill.dump(explainer, f)
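The save step above drops the prediction function because it is re-attached when the explainer is loaded next to the live model. A minimal sketch of that round trip follows, using only the standard library. `ToyExplainer`, `save_state` and `load_explainer` are hypothetical stand-ins rather than the alibi or Seldon API, and the sketch pickles a plain state dict where the slide dumps the whole object with dill.

```python
import pickle


class ToyExplainer:
    """Hypothetical stand-in for a fitted explainer (not the alibi API)."""

    def __init__(self, predict_fn, feature_names):
        self.predict_fn = predict_fn
        self.feature_names = feature_names


def save_state(explainer):
    """Serialize every attribute except the prediction function."""
    state = {k: v for k, v in vars(explainer).items() if k != "predict_fn"}
    return pickle.dumps(state)


def load_explainer(blob, predict_fn):
    """Rebuild the explainer and attach the predict function of the live model."""
    state = pickle.loads(blob)
    return ToyExplainer(predict_fn=predict_fn, **state)


trained = ToyExplainer(lambda rows: [0 for _ in rows], ["age", "marital status"])
blob = save_state(trained)

# At serving time the prediction function comes from the deployed model
restored = load_explainer(blob, lambda rows: [1 for _ in rows])
print(restored.feature_names, restored.predict_fn([[39, 1]]))
```

Keeping the prediction function out of the saved artifact means the explainer always queries whichever model is currently deployed.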
Seldon GitOps Serving
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: sklearn
spec:
  name: iris
  predictors:
  - graph:
      children: []
      implementation: SKLEARN_SERVER
      modelUri: gs://seldon-models/sklearn/iris
      name: classifier
    name: default
    replicas: 1
Model in storage bucket
Manifest in Git
KFServing too
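Because the manifest is plain data, it can be generated and committed to Git as JSON, which is the file type the pipeline example pushes. The sketch below assumes a hypothetical helper `seldon_deployment` that mirrors the fields shown on this slide. It reuses one name for both `metadata.name` and `spec.name`, which is a simplification, not a Seldon requirement.

```python
import json


def seldon_deployment(name, model_uri, implementation="SKLEARN_SERVER", replicas=1):
    """Build a SeldonDeployment manifest with the same fields as the slide."""
    return {
        "apiVersion": "machinelearning.seldon.io/v1alpha2",
        "kind": "SeldonDeployment",
        "metadata": {"name": name},
        "spec": {
            "name": name,
            "predictors": [
                {
                    "graph": {
                        "children": [],
                        "implementation": implementation,
                        "modelUri": model_uri,
                        "name": "classifier",
                    },
                    "name": "default",
                    "replicas": replicas,
                }
            ],
        },
    }


manifest = seldon_deployment("iris", "gs://seldon-models/sklearn/iris")
# sort_keys gives stable output, so Git diffs show only real changes
manifest_json = json.dumps(manifest, indent=2, sort_keys=True)
print(manifest_json)
```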
Kubeflow Pipeline
GitOps from Pipeline
from kfp import dsl

# Example showing how serving YAML can be pushed to git from a pipeline
@dsl.pipeline(
    name="Serving gitops",
    description="Example of pushing to git from pipeline"
)
def serve_gitops(user='ryandawsonuk',
                 email='rd@seldon.io',
                 git_token='xxxxxxxxxx',
                 file='https://raw.githubusercontent.com/ryandawsonuk/seldon_gitops_repo_old1/master/default/SeldonDeployment-income-classifier2.json',
                 filename='SeldonDeployment-income-classifier2.json'):
    # Push file to serving repo
    push = dsl.ContainerOp(
        name="push",
        image="alpine/git:latest",
        command=["sh", "-c"],
        arguments=[
            "git config --global url.'https://" + str(git_token) + ":@github.com/'.insteadOf 'https://github.com/'; "
            + "wget " + str(file) + "; "
            + "git config --global user.name '" + str(user) + "'; "
            + "git config --global user.email " + str(email) + "; "
            + "git clone https://github.com/" + str(user) + "/seldon-gitops; "
            + "ls; "
            + "cp " + str(filename) + " ./seldon-gitops/default/" + str(filename) + "; "
            + "cd ./seldon-gitops/; "
            + "git add .; "
            + "git commit -m 'add " + str(filename) + "'; "
            + "git push -u origin master;"
        ]
    )
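The push step assembles its shell script by string concatenation. As a hedged variation, the script can be built from a list of steps with each value shell-quoted. The helper `build_push_script` is hypothetical and assumes plain Python strings, not Kubeflow pipeline parameters, which are only resolved at run time. Token configuration is left out on the assumption that it would come from a secret.

```python
import shlex


def build_push_script(user, email, file_url, filename, repo="seldon-gitops", branch="master"):
    """Return the shell script for the push step, with each value shell-quoted."""
    q = shlex.quote
    repo_url = "https://github.com/" + user + "/" + repo
    steps = [
        "wget " + q(file_url),
        "git config --global user.name " + q(user),
        "git config --global user.email " + q(email),
        "git clone " + q(repo_url),
        "cp " + q(filename) + " " + q("./" + repo + "/default/" + filename),
        "cd " + q("./" + repo),
        "git add .",
        "git commit -m " + q("add " + filename),
        "git push -u origin " + q(branch),
    ]
    # '&&' stops at the first failing step; the slide's version chains with ';'
    return " && ".join(steps)


script = build_push_script(
    user="ryandawsonuk",
    email="rd@seldon.io",
    file_url="https://example.com/SeldonDeployment-income-classifier2.json",
    filename="SeldonDeployment-income-classifier2.json",
)
print(script)
```

Quoting keeps a filename containing spaces or shell metacharacters from changing what the script does.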
GitOps for Serving
● Great for Data Science and Ops Collaboration
GitOps and Namespaces
Observability
Now we know what’s running…
So what is it doing?
Metrics Visibility
Metrics In Action
Sidenote: Access Control
Can’t have metrics without requests
Access from curl or Seldon UI predict/load-test
If you don’t have an existing auth preference we like...
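Sending a prediction from a script amounts to the same request a curl call would make. The sketch below builds, but does not send, a REST request with a Seldon `ndarray` payload. The host, namespace and feature row are placeholders, and the URL path assumes the deployment is exposed through an ingress that routes `/seldon/<namespace>/<deployment>/`. Authentication is omitted.

```python
import json
import urllib.request


def build_predict_request(host, namespace, deployment, rows):
    """Build (but do not send) a REST prediction request for a Seldon deployment."""
    url = "http://{}/seldon/{}/{}/api/v0.1/predictions".format(host, namespace, deployment)
    payload = {"data": {"ndarray": rows}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and feature row; a real row must match the model's features
request = build_predict_request(
    "localhost:8003", "default", "income",
    [[39, 7, 1, 1, 1, 1, 4, 1, 2174, 0, 40, 9]],
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it once an endpoint is reachable.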
Explainability
So now we know what it’s doing…
Why is it doing that?
Request Logging
To ask why, we first need to know what happened
Explainer Deployment
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: income
spec:
  name: income
  predictors:
  - graph:
      children: []
      implementation: SKLEARN_SERVER
      modelUri: gs://seldon-models/sklearn/income/model
      name: classifier
    explainer:
      type: anchor_tabular
      modelUri: gs://seldon-models/sklearn/income/explainer
    name: default
    replicas: 1
Declarative yaml
Wizards for time to value & sandboxing
Alibi Explainers
- Includes techniques for black-box models
- We’ll use anchors for tabular data
- Anchors are sufficient conditions to ensure a certain prediction
- As long as the anchor holds, the prediction should remain the same regardless of the values of the other features
- Anchors are chosen to maximise the range for which the prediction holds
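The two properties in the bullets above are usually called precision (how often the prediction stays the same while the anchor holds) and coverage (how much of the data the anchor applies to). The toy sketch below estimates both over a small hand-made dataset. It is not alibi's search algorithm, and `anchor_stats`, `toy_predict` and the rows are hypothetical.

```python
def anchor_stats(anchor, predict, rows, instance):
    """Estimate precision and coverage of `anchor` for `instance` over `rows`.

    anchor:  dict of feature name -> required value (the "if" part of the rule)
    predict: function mapping a row dict to a label
    """
    target = predict(instance)
    matching = [r for r in rows if all(r[k] == v for k, v in anchor.items())]
    coverage = len(matching) / len(rows)
    if not matching:
        return 0.0, coverage
    precision = sum(predict(r) == target for r in matching) / len(matching)
    return precision, coverage


# Toy model: predicts "high" only for married people working 40+ hours
def toy_predict(row):
    return "high" if row["married"] and row["hours"] >= 40 else "low"


rows = [
    {"married": True, "hours": 45, "age": 50},
    {"married": True, "hours": 20, "age": 30},
    {"married": False, "hours": 50, "age": 41},
    {"married": False, "hours": 10, "age": 23},
]
instance = {"married": False, "hours": 50, "age": 41}

# The anchor "married == False" fixes the prediction whatever the other features are
precision, coverage = anchor_stats({"married": False}, toy_predict, rows, instance)
print(precision, coverage)
```

An empty anchor covers every row but is less precise, which is the trade-off an anchor search balances.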
Alibi Explanations
Predict-Explain Flow
Wrap-up
● What Seldon Customers want
○ Time to value
○ Governance
● GitOps helps with both
● Pipeline to serving walkthrough with
○ Oversight
○ Observability
○ Explainability
The Future
Very excited about:
● Metadata integrations
● Permissions
● KFServing and MLGraph
