Machine Learning
Integrating Capabilities into your Team
WELCOME
Cameron Vetter
I have 20 years of experience using Microsoft tools and technologies to develop
software. I have experience in many roles including Development, Architecture,
Infrastructure, Management, and Leadership roles. I've worked for some of the largest
companies in the world and for small local companies getting a breadth of experience
in different Corporate Cultures. Currently, I work at SafeNet Consulting, where I get to
do what I love... Architect, Design, and Develop great software! I currently focus on
Microservices, SOA, Azure, Cognitive Toolkit, and Kubernetes.
Principal Cloud Consultant
A Partner to Advise and Support
About SafeNet
Consulting
SafeNet specializes in being partners in your
success. We currently focus on Custom Application
Development, Cloud Consulting Services,
Data & Analytics, and User Experience Strategy.
Introduction
Machine Learning Definition
Tooling and Standards
Integration and Deployment
Versioning
Bias and Ethics
My Recommendations
Question & Answer
Agenda
Machine Learning Definition
Wikipedia’s
Definition
Machine learning (ML) is the scientific study of algorithms and
statistical models that computer systems use to effectively perform
a specific task without using explicit instructions, relying on
patterns and inference instead. Machine learning algorithms build a
mathematical model of sample data, known as "training data", in
order to make predictions or decisions without being explicitly
programmed to perform the task
Patterns and Inference
Relies on Patterns and Inference, not on explicit
instructions or hand-written Algorithms
Model Based
Training produces a model, and that model is the core of
any Machine Learning solution
Not Just Glorified Statistics
We borrow many terms from statistics such as
biases, weights, models, and regressions, but that
does not make ML a segment of statistics. ML is a
segment of computer algorithms
Training Data
Uses some form of training data to learn from in
order to make its decisions
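To make the "training data, then model, then prediction" idea concrete, here is a minimal sketch in plain Python. The data points and the choice of a straight-line model are assumptions made for illustration only; they are not from the talk.

```python
def fit_line(xs, ys):
    """Learn a slope and intercept from training data by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned model to an input it has not seen before."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: the rule y = 2x + 1 is never written down in code.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]

model = fit_line(train_x, train_y)  # training step
estimate = predict(model, 5.0)      # inference step
```

Nothing in the code states the relationship between inputs and outputs; it is recovered from the examples, which is the point of the definition above.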
Tooling and Standards
Prebuilt Services
/ Machine Learning Based Web Services /
• Vision
• Speech
• Language
• Knowledge
• Search
24 Different Services available via REST APIs and SDKs
No Coding Services
/ Custom Machine Learning Services without the Code /
• Targets Data Scientists and Data
Engineers
• Poorly suited to Developers
• Creates REST Services
High Level Libraries
/ Neural Network Libraries /
• Targets Developers
• Most are Python Based
• Abstracts away complexity of ML
Low Level Libraries
/ Neural Network Libraries /
• Targets Machine Learning Devs
• Most are Python Based
• Tools provided to allow easy
implementation of algorithms
Neural Network Exchange Format (NNEF)
Open Neural Network Exchange Format (ONNX)
Standard
/ Not Much Competition /
Integration and Deployment
REST Services
Isolate your Machine Learning code from
other code creating appropriate
boundaries and allowing it to be
independently released.
Encapsulated
Consumers should not have to
understand the machine learning.
Present them with a simple RESTful interface,
like any other service designed for
consumption.
Simple
Unless you can’t isolate the code. When
embedding in an IoT Edge device, your
Machine Learning cannot be isolated,
but it can still have logical boundaries,
such as Docker Containers.
Integrated
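A minimal sketch of the encapsulation described above, using only the Python standard library. The `predict` function is a hypothetical stand-in for a real model call, and the `/v1/predict` route and JSON shape are assumptions chosen for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a real model: returns the mean of the inputs."""
    return sum(features) / len(features)


def handle_predict(body: bytes):
    """Turn a JSON request body into a (status_code, JSON response body) pair."""
    try:
        payload = json.loads(body)
        features = [float(v) for v in payload["features"]]
        if not features:
            raise ValueError("features must not be empty")
    except (ValueError, KeyError, TypeError):
        error = {"error": "expected a JSON object with a non-empty 'features' list"}
        return 400, json.dumps(error).encode()
    return 200, json.dumps({"prediction": predict(features)}).encode()


class PredictHandler(BaseHTTPRequestHandler):
    """Consumers only see this HTTP contract, never the model behind it."""

    def do_POST(self):
        if self.path != "/v1/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, response = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)


def serve(port: int = 8080):
    """Run the service locally; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the service. Because the model sits behind `predict`, it can be retrained or replaced without consumers changing anything on their side.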
N-Tier
Add this service into your business logic tier. Can
also fit well into this architecture.
Others
Onion, Event Driven, CQRS, ESB, Spaghetti, etc…
Services / MicroServices
Fits well with my recommendation to wrap this in a
RESTful service. It is very easy to turn that into a
Service
Architecture
IAAS / PAAS
Platform As A Service is a great choice, but
Infrastructure As A Service can also work well.
Containers
Fits well in a container, consider keeping your model
in one container and your service in a separate
container.
Edge Devices
Embed your Machine Learning into IoT devices,
laptops, web pages, or any other devices on the
Edge.
Containers
Training = N/A
Deployment = Azure Kubernetes Service
PAAS
Training = Azure Notebooks
Deployment = Azure Web Apps
IAAS
Training = Azure Data Science VMs
Deployment = Azure Data Science VMs
Examples
Azure tools that can be applied to different deployment scenarios
Versioning
Design for Replacement
Your team will want to iterate often, especially in the beginning. Plan your versioning strategy as though you are
swapping out your ML training daily. Don’t tightly couple your ML Service to your ML Training!
A/B
Testing
Assume you will need A/B Testing. ML Models
that perform perfectly during testing can fall apart in the
real world!
Which Performs Better?
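One way to set up the comparison is to split traffic deterministically and score each model on real outcomes. The sketch below assumes a string user id is available and that you can later record whether each prediction turned out correct; the function names are hypothetical.

```python
import hashlib


def assign_variant(user_id: str, b_share: float = 0.5) -> str:
    """Route a user to model 'A' or 'B'; the same user always gets the same model."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform value in [0, 1)
    return "B" if bucket < b_share else "A"


def accuracy_by_variant(outcomes):
    """outcomes: iterable of (variant, was_correct) pairs -> accuracy per variant."""
    totals, correct = {}, {}
    for variant, was_correct in outcomes:
        totals[variant] = totals.get(variant, 0) + 1
        correct[variant] = correct.get(variant, 0) + int(was_correct)
    return {variant: correct[variant] / totals[variant] for variant in totals}
```

Hashing the user id rather than choosing at random per request keeps each user's experience consistent, and `b_share` lets you start the new model on a small slice of traffic.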
What Do I Need to Version?
Code: Standard Source Control procedures and labeling.
Data: Cloud Storage. Each Data Set Version should be
independently accessible.
Model: Serialize your model and check it into its own
Source Control Repo. Label Appropriately!
Training: Serialize your training and scalars and check them
into Source Control with the Model.
Interface
Compatibility
• Standardize on Model Inputs
and Outputs.
• Consider Model Interface
changes to be a breaking
change.
• Breaking changes will require
versioning the service.
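A small sketch of how a team might detect a breaking interface change and version the service route accordingly. The schema representation (field name mapped to a type name) and the `/vN/predict` route pattern are assumptions. This sketch treats removed or retyped fields as breaking and added fields as compatible; a stricter team may prefer to treat every change as breaking, as the slide suggests.

```python
def is_breaking_change(old_schema: dict, new_schema: dict) -> bool:
    """True if any field consumers rely on was removed or changed type."""
    return any(
        new_schema.get(name) != type_name
        for name, type_name in old_schema.items()
    )


def next_route(current_major: int, old_schema: dict, new_schema: dict) -> str:
    """Bump the major version in the route only when the change is breaking."""
    breaking = is_breaking_change(old_schema, new_schema)
    major = current_major + 1 if breaking else current_major
    return f"/v{major}/predict"


# Hypothetical output schemas for a classifier service.
v1_outputs = {"label": "str", "score": "float"}
added_field = {"label": "str", "score": "float", "explanation": "str"}
retyped_field = {"label": "int", "score": "float"}
```

With these schemas, `next_route(1, v1_outputs, added_field)` stays on `/v1/predict`, while `next_route(1, v1_outputs, retyped_field)` moves to `/v2/predict`.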
Bias and Other Risks
Bias is an error from erroneous assumptions
in the learning algorithm. High bias can cause
an algorithm to miss the relevant relations
between features and target outputs. This is
referred to as underfitting.
Bias
Source: https://en.wikipedia.org/wiki/Bias%E2%80%93variance_tradeoff
Variance is an error from sensitivity to
small fluctuations in the training set. High
variance can cause an algorithm to
model the random noise in the training
data, rather than the intended outputs.
This is referred to as overfitting.
Variance
Source: https://en.wikipedia.org/wiki/Bias%E2%80%93variance_tradeoff
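The two definitions above are tied together by the standard bias-variance decomposition of expected squared error. The notation here is an assumption for illustration: the data follow y = f(x) + ε with zero-mean noise of variance σ², and \hat{f} is the model learned from a randomly drawn training set.

```latex
\mathbb{E}\!\left[\big(y - \hat{f}(x)\big)^2\right]
  = \big(\operatorname{Bias}[\hat{f}(x)]\big)^2
  + \operatorname{Var}[\hat{f}(x)]
  + \sigma^2,
\qquad
\operatorname{Bias}[\hat{f}(x)] = \mathbb{E}[\hat{f}(x)] - f(x),
\quad
\operatorname{Var}[\hat{f}(x)] = \mathbb{E}\!\left[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\right]
```

Underfitting shows up as a large first term, overfitting as a large second term, and the third term is noise that no model can remove.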
Real world data is messy. Expect to have too
little data, potentially causing both Bias and Variance,
or to have missing data points in a data set. Data is
the hard part of machine learning.
Incomplete Data
My Recommendations
Don’t start by hiring a Data Scientist. There are
many tools that can help people with less expensive
skill sets. Developers, Data Analysts, and Data
Architects will be able to help you get started and
identify when you need a Data Scientist.
Data
Scientists
Start with off-the-shelf prebuilt
services. Select problems that fit
these solutions, and get some quick
wins.
Prebuilt ML Services
Use some Neural Networks available
on GitHub. Allow your team to get
used to the language / tool chain.
Prebuilt Neural Networks
Create a Proof of Concept within your
domain leveraging High Level
Libraries to build your own Neural
Network.
POC with High Level Libraries
Use these high level libraries to
create full production systems. Don’t
bother with the low level libraries,
unless a specific need forces you to.
Full Solutions
Start Small
/ Work smart and save money /
Source: https://bit.ly/netflixjupyter/
Prebuilt
Strategy
I recommend using the Façade Design Pattern to
ensure that you are loosely coupled to the prebuilt
service you are consuming. Don’t get tightly tied into a
platform, especially when you are experimenting!
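A minimal sketch of the Façade idea applied to a prebuilt service. Every name here is hypothetical: `SentimentBackend` stands for whichever vendor API you consume, and `KeywordBackend` is a trivial local stand-in so the example runs without any cloud account.

```python
from typing import Protocol


class SentimentBackend(Protocol):
    """What the rest of the application is allowed to know about any provider."""

    def score(self, text: str) -> float:
        """Return a sentiment score in [-1.0, 1.0]."""
        ...


class KeywordBackend:
    """Local stand-in for experiments and tests; a vendor adapter would sit beside it."""

    POSITIVE = {"good", "great", "love"}
    NEGATIVE = {"bad", "poor", "hate"}

    def score(self, text: str) -> float:
        words = text.lower().split()
        hits = sum(w in self.POSITIVE for w in words) - sum(w in self.NEGATIVE for w in words)
        return max(-1.0, min(1.0, hits / max(len(words), 1)))


class SentimentFacade:
    """The only class application code imports; the backend can be swapped freely."""

    def __init__(self, backend: SentimentBackend):
        self._backend = backend

    def is_positive(self, text: str) -> bool:
        return self._backend.score(text) > 0
```

Switching platforms then means writing one new adapter class with a `score` method and passing it to `SentimentFacade`, with no changes in calling code.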
• Pick a very popular library like Keras
• Integrate into a REST Service
• Plug the REST Service into your Current
Architecture and Current Infrastructure
High Level
Library
Strategy
ONNX is an open format to represent deep
learning models. With ONNX, AI
developers can more easily move models
between state-of-the-art tools and choose
the combination that is best for them.
ONNX is developed and supported by a
community of partners.
ONNX.AI
Be Flexible
• Don’t buy a ton of GPUs and Servers!
• Leverage the Cloud To Train
• Use Batch AI and Data Science VMs in
Azure
Training
• Reach consensus on your versioning
strategy up front.
• Use separate containers for your ML
Service and model
• Plan for A/B Testing
• Have the ability to instantly roll forward or
backward
Up Front
Versioning
Strategy
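The "instantly roll forward or backward" point can be sketched as a registry that keeps every released model and moves a pointer between them. This is an in-memory illustration under the assumption that a model is any callable; a real deployment would more likely switch container tags or a routing rule.

```python
class ModelRegistry:
    """Keeps every released model version and a pointer to the active one."""

    def __init__(self):
        self._versions = []  # release order, oldest first
        self._models = {}    # version label -> callable model
        self._active = None  # index into _versions

    def release(self, version, model):
        """Register a new version and make it the active one."""
        self._versions.append(version)
        self._models[version] = model
        self._active = len(self._versions) - 1

    def roll_back(self):
        """Activate the previous release without redeploying anything."""
        if self._active is None or self._active == 0:
            raise RuntimeError("no earlier version to roll back to")
        self._active -= 1

    def roll_forward(self):
        """Re-activate the next release after a rollback."""
        if self._active is None or self._active == len(self._versions) - 1:
            raise RuntimeError("already on the newest version")
        self._active += 1

    @property
    def active_version(self):
        return None if self._active is None else self._versions[self._active]

    def predict(self, x):
        """Serve the request with whichever version is currently active."""
        return self._models[self.active_version](x)
```

Because old versions are never deleted, a rollback is a pointer change rather than a rebuild, which is what makes it effectively instant.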
Use a uniform versioning convention that
includes time and date.
For Example:
(datetime)-(model name)-(model version)-(training script id).json
Versioning
Scheme
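The naming convention above could be generated by a small helper like this one. The UTC timestamp format and the example values are assumptions; the slide only fixes the order of the four parts.

```python
from datetime import datetime, timezone


def artifact_name(model_name, model_version, training_script_id, when=None):
    """Build '(datetime)-(model name)-(model version)-(training script id).json'."""
    # Pass a timezone-aware datetime; naive values are interpreted as local time.
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = when.strftime("%Y%m%dT%H%M%SZ")  # sortable, no hyphens inside the stamp
    return f"{stamp}-{model_name}-{model_version}-{training_script_id}.json"


# Hypothetical example values.
released = datetime(2019, 9, 1, 12, 30, 0, tzinfo=timezone.utc)
name = artifact_name("churn", "1.2.0", "train42", when=released)
# name == "20190901T123000Z-churn-1.2.0-train42.json"
```

Putting the timestamp first means a plain alphabetical listing is also a chronological one. Avoid hyphens inside the individual parts if you plan to split the name back into its fields.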
• Tough problem to solve.
• Your data is much closer to code.
• Check out off-the-shelf tools: Data Version
Control, Pachyderm, Cookiecutter Data
Science, Luigi
Data
Version
Control
Source: https://shuaiw.github.io/2017/07/30/versioning-data-science.html
www.cameronvetter.com
Any Questions?
@poshporcupine Linkedin.com/in/cameronvetter

