Deploying End-to-End Deep Learning Pipelines with ONNX

Nick Pentreath
Principal Engineer
@MLnick

About
IBM Developer / © 2019 IBM Corporation
– @MLnick on Twitter & GitHub
– Principal Engineer, IBM CODAIT (Center for
Open-Source Data & AI Technologies)
– Machine Learning & AI
– Apache Spark committer & PMC
– Author of Machine Learning with Spark
– Various conferences & meetups

CODAIT
Improving the Enterprise AI Lifecycle in Open Source
Center for Open Source
Data & AI Technologies
CODAIT aims to make AI solutions dramatically
easier to create, deploy, and manage in the
enterprise.
We contribute to and advocate for the open-source
technologies that are foundational to IBM’s AI
offerings.
30+ open-source developers!

The Machine Learning
Workflow

Perception

In reality the
workflow spans teams …

… and tools …

… and is a small (but critical!)
piece of the puzzle
*Source: Hidden Technical Debt in Machine Learning Systems

Machine Learning
Deployment

What, Where, How?
– What are you deploying?
• What is a “model”?
– Where are you deploying?
• Target environment (cloud, browser, edge)
• Batch, streaming, real-time?
– How are you deploying?
• “devops” deployment mechanism
• Serving framework
We will talk mostly about the what

What is a “model”?

Deep Learning doesn’t need
feature engineering or data
processing …
right?

Deep learning
pipeline?
Input image → Inference → Prediction (beagle: 0.82)
Source: https://ai.googleblog.com/2016/03/train-your-own-image-classifier-with.html

Deep learning
pipeline!
Input image → Image pre-processing → Inference → Post-processing → Prediction
– Image pre-processing (PIL, OpenCV, tf.image, …)
• Decode image
• Resize
• Normalization
• Convert types / format
– Post-processing (custom Python)
• [0.2, 0.3, … ] → (label, prob)
• Sort
• Label map
– Prediction
• beagle: 0.82
• basset: 0.09
• bluetick: 0.07
• ...
* Logos trademarks of their respective projects

Image pre-processing
Decode image
Resize
Normalization
– Color mode
• RGB vs BGR
– Decoding lib
• PIL vs OpenCV vs tf.image vs skimage
• libJPEG vs OpenCV
• INTEGER_FAST vs INTEGER_ACCURATE
– Data layout
• NCHW vs NHWC
– Prediction shown on the slide: cat: 0.45, beagle: 0.34
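A minimal sketch of two of the differences listed on this slide: channel order (RGB vs BGR) and data layout (channels-last vs channels-first). Plain nested Python lists stand in for image tensors. The helper names rgb_to_bgr and hwc_to_chw are hypothetical and not taken from any library, and the batch dimension N is left out for brevity.

```python
def rgb_to_bgr(image_hwc):
    """Reverse the channel order of an image stored as [height][width][channel]."""
    return [[pixel[::-1] for pixel in row] for row in image_hwc]


def hwc_to_chw(image_hwc):
    """Move channels first: [height][width][channel] -> [channel][height][width]."""
    channels = len(image_hwc[0][0])
    return [
        [[pixel[c] for pixel in row] for row in image_hwc]
        for c in range(channels)
    ]


# A 1x2 image: one red pixel and one blue pixel, in RGB order.
image_rgb = [[[255, 0, 0], [0, 0, 255]]]

image_bgr = rgb_to_bgr(image_rgb)  # same pixels, channels reversed
image_chw = hwc_to_chw(image_rgb)  # one 1x2 plane per channel
```

PIL decodes to RGB while OpenCV's imread returns BGR by default, so feeding one library's output into a model trained on the other's silently swaps the red and blue channels unless a step like rgb_to_bgr is part of the deployed pipeline.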

Image pre-processing
Decode image
Resize
Normalization
– Operation order
• Normalize INT pixels (128) → Convert Float32
• Convert Float32 → Normalize float (0.5)
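The two orderings on this slide can give slightly different inputs to the model. The sketch below is one possible reading, not the slide's exact arithmetic: it assumes 8-bit pixels, a scale factor of 255, and that the offsets 128 and 0.5 are subtracted before and after the float conversion respectively. Both function names are hypothetical.

```python
def normalize_int_then_convert(pixel, offset=128, scale=255.0):
    """Subtract an integer offset first, then convert to float and scale."""
    return float(pixel - offset) / scale


def convert_then_normalize_float(pixel, offset=0.5, scale=255.0):
    """Convert to a float in [0, 1] first, then subtract a float offset."""
    return float(pixel) / scale - offset


pixel = 200
a = normalize_int_then_convert(pixel)    # (200 - 128) / 255
b = convert_then_normalize_float(pixel)  # 200 / 255 - 0.5
gap = abs(a - b)                         # constant shift of 0.5 / 255
```

Under these assumptions every pixel is shifted by the same small amount, because 128 / 255 is not exactly 0.5. The shift is tiny, but it is a systematic mismatch between training-time and serving-time inputs.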

Inference post-processing
– Steps
• Convert to numpy array
• [0.2, 0.3, … ] → (label, prob)
• Sort
• Label map
– Options
• Custom loading label mapping / vocab / …
• TF SavedModel (assets)
• Keras decode_predictions
• Custom code
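The post-processing steps above (score vector, label map, sort) amount to a few lines of custom code, which is exactly why they tend to be left out of the exported model. A minimal sketch in plain Python follows; the function top_k_labels and the label list are hypothetical, chosen to echo the beagle example from the earlier slide.

```python
def top_k_labels(scores, labels, k=3):
    """Pair each score with its label and return the k highest-scoring pairs."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical label map and raw model output.
labels = ["basset", "beagle", "bluetick", "boxer"]
scores = [0.09, 0.82, 0.07, 0.02]

predictions = top_k_labels(scores, labels, k=3)
```

The label list is an asset that must travel with the model: if the serving side loads a different or reordered vocabulary, the scores stay the same but the labels are wrong.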

Pipelines, not Models
– Deploying just the model part of the
workflow is not enough
– Entire pipeline must be deployed
• Data transforms
• Feature extraction & pre-processing
• DL / ML model
• Prediction transformation
– Even ETL is part of the pipeline!
– Pipelines in frameworks
• scikit-learn
• Spark ML pipelines
• TensorFlow Transform
• pipeliner (R)

Challenges
– Formats
• Each framework does things differently
• Proprietary formats: lock-in, not portable
– Lack of standardization leads to custom
solutions and extensions
– Need to manage and bridge many different:
• Languages - Python, R, Notebooks, Scala / Java / C
• Frameworks – too many to count!
• Dependencies
• Versions
– Performance characteristics can be highly
variable across these dimensions
– Friction between teams
• Data scientists & researchers – latest & greatest
• Production – stability, control, minimize changes,
performance
• Business – metrics, business impact, product must
always work!

Containers for ML
Deployment

Containers are “The Solution”
… right?
– Container-based deployment has
significant benefits
• Repeatability
• Ease of configuration
• Separation of concerns – focus on what, not
how
• Allow data scientists & researchers to use their
language / framework of choice
• Container frameworks take care of (certain)
monitoring, fault tolerance, HA, etc.
– But …
• What goes in the container is still the most
important factor
• Performance can be highly variable across
language, framework, version
• Requires devops knowledge, CI / deployment
pipelines, good practices
• Does not solve the issue of standardization
– Formats
– APIs exposed
• A serving framework is still required on top

Open Standards for
Model Serialization &
Deployment

Why a standard?
– Standard format
– Single stack
• Execution
• Optimization
• Tooling (viz, analysis, …)

Why an Open Standard?
– Open-source vs open standard
– Open source (license) is only one
aspect
• OSS licensing allows free use, modification
• Inspect the code etc
• … but may not have any control
– Open governance is critical
• Avoid concentration of control (typically by large
companies, vendors)
• Visibility of development processes, strategic
planning, roadmaps
– However there are downsides
• Standard needs wide adoption and critical mass
to succeed
• A standard can move slowly in terms of new
features, fixes and enhancements
• Design by committee
• Keeping up with pace of framework development

Open Neural Network Exchange
(ONNX)
– Championed by Facebook & Microsoft
– Protobuf for serialization format and type
specification
– Describes
• computation graph (inputs, outputs, operators) - DAG
• values (weights)
– In this way the serialized graph is “self-
contained”
– Focused on Deep Learning / tensor operations
– Baked into PyTorch from 1.0.0 / Caffe2 as the
serialization & interchange format

ONNX Graphs
[Figure: node matmult/Mul (op#0) with inputs X (input0) and Y (input1) and output Z (output0)]
Source: http://onnx.ai/sklearn-onnx/auto_examples/plot_pipeline.html
graph {
  node {
    input: "X"
    input: "Y"
    output: "Z"
    name: "matmult"
    op_type: "Mul"
  }
  input {
    name: "X"
    type { ... }
  }
  output {
    name: "Z"
    type { ... }
  }
}
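The graph above can be read as data: a list of nodes, each naming its operator, inputs and outputs. The sketch below models that idea in plain Python and evaluates it. It is not the ONNX Python API or a real runtime; the Node and Graph classes, the OPS table and the run function are hypothetical, and only element-wise Mul and Add on flat lists are supported.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    op_type: str
    inputs: list
    outputs: list
    name: str = ""


@dataclass
class Graph:
    nodes: list
    inputs: list
    outputs: list
    initializers: dict = field(default_factory=dict)  # stored weights


# Hypothetical operator table; a real runtime implements the full operator set.
OPS = {
    "Mul": lambda x, y: [a * b for a, b in zip(x, y)],
    "Add": lambda x, y: [a + b for a, b in zip(x, y)],
}


def run(graph, feeds):
    """Evaluate nodes in listed order (assumed to be topologically sorted)."""
    values = {**graph.initializers, **feeds}
    for node in graph.nodes:
        args = [values[name] for name in node.inputs]
        values[node.outputs[0]] = OPS[node.op_type](*args)
    return {name: values[name] for name in graph.outputs}


# The single-node graph from the slide: Z = Mul(X, Y).
graph = Graph(
    nodes=[Node("Mul", ["X", "Y"], ["Z"], name="matmult")],
    inputs=["X", "Y"],
    outputs=["Z"],
)
result = run(graph, {"X": [1.0, 2.0, 3.0], "Y": [4.0, 5.0, 6.0]})
```

Because the operators and the weights are both part of the serialized graph, any runtime that implements the operator table can execute it without access to the framework that produced it.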

ONNX Graphs
Source: https://github.com/onnx/tutorials/blob/master/tutorials/VisualizingAModel.md
SqueezeNet Graph Visualization

ONNX-ML
– Provides support for (parts of)
“traditional” machine learning
• Additional types
– sequences
– maps
• Operators
– Vectorizers (numeric & string data)
– One hot encoding, label encoding
– Scalers (normalization, scaling)
– Models (linear, SVM, TreeEnsemble)
– …
https://github.com/onnx/onnx/blob/master/docs/Operators-ml.md
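For readers less familiar with the feature-processing operators listed above, the sketch below shows what label encoding, one-hot encoding and scaling compute. It is plain Python written for illustration, loosely modelled on the behaviour of the corresponding ONNX-ML operators rather than on their exact attribute names; all three function names are hypothetical.

```python
def label_encode(values, classes):
    """Map each string label to its integer position in `classes`."""
    index = {label: i for i, label in enumerate(classes)}
    return [index[v] for v in values]


def one_hot(indices, depth):
    """Turn each integer index into a vector with a single 1.0 entry."""
    return [[1.0 if i == idx else 0.0 for i in range(depth)] for idx in indices]


def scale(values, offset, factor):
    """Shift then rescale each value: (x - offset) * factor."""
    return [(v - offset) * factor for v in values]


classes = ["cat", "dog", "fish"]
encoded = label_encode(["dog", "cat"], classes)
hot = one_hot(encoded, len(classes))
scaled = scale([10.0, 20.0], offset=10.0, factor=0.5)
```

Expressing these steps as graph operators, instead of as framework-specific Python, is what lets the feature processing be serialized together with the model.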

ONNX-ML
– Exporter support
• Scikit-learn – 60+
• LightGBM
• XGBoost
• Apache Spark ML – 25+
• Keras – all layers + TF custom layers
• Libsvm
• Apple CoreML
https://github.com/onnx/onnxmltools/
http://onnx.ai/sklearn-onnx/index.html
https://github.com/onnx/keras-onnx

ONNX-ML for Apache Spark
– Exporter support
• Linear models, DT, RF, GBT,
NaiveBayes, OneVsRest
• Scalers, Imputer, Binarizer,
Bucketizer
• String indexing, Stop words
• OneHotEncoding, Feature
selection / slicing
• PCA, Word2Vec, LSH
https://github.com/onnx/onnxmltools/tree/master/onnxmltools/convert/sparkml
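    from onnxmltools import convert_sparkml
    from onnxmltools.convert.common.data_types import StringTensorType

    # Declare the name, type and shape of each pipeline input column
    initial_types = [
        ("label", StringTensorType([1, 1])),
        ...
    ]
    # Fit the Spark ML pipeline, then convert the fitted model to ONNX
    pipeline_model = pipeline.fit(training_data)
    onnx_model = convert_sparkml(pipeline_model,
                                 ..., initial_types)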

ONNX-ML for Apache Spark
– Missing exporters
• Feature hashing, TFIDF
• RFormula
• NGram
• SQLTransformer
• Models – clustering, FP, ALS
– Current issues
• Tokenizer - supported but not in ONNX spec
(custom operator)
• Limited invalid data handling
• Python / PySpark only

ONNX Ecosystem
• ONNX Spec
• Converters
• ONNX Model Zoo
• Network visualization
• Single stack
• Other compliant runtimes

ONNX Governance
– Move towards open governance
model
• Multiple vendors
• High level steering committee
• Special Interest Groups (SIGs)
– Converters
– Training
– Pipelines
– Model zoos
• Working groups
https://github.com/onnx/onnx/tree/master/community
– However ... not in a foundation

ONNX Missing Pieces
– ONNX
• Operator / converter coverage
– E.g. TensorFlow coverage
• Image processing
– Support for basic resize, crop
– No support for reading image directly (like
tf.image.decode_jpeg)
• String processing
• Comprehensive benchmarks
– ONNX-ML
• Types
– datetime
• Operators
– String processing / NLP – e.g. tokenization
– Hashing
– Clustering models
• Specific exporters
– Apache Spark ML – Python only
– Very basic tokenization in sklearn
– No support for Keras tokenizer
• Combining frameworks
– Still ad-hoc, requires custom code

Summary
ONNX
• Backing by large industry
players
• Growing rapidly with lots
of momentum
• Open governance model
• Focused on deep learning
operators
• ONNX-ML provides some
support for “traditional”
ML and feature processing
• Still relatively new
• Difficult to keep up with
breadth and depth of
framework evolution
• Still work required for
feature processing and
other data types (strings,
datetime, etc)
• Limited image & text pre-
processing

Conclusion
– Open standard for serialization and
deployment of deep learning pipelines
• True portability across languages, frameworks,
runtimes and versions
• Execution environment independent of the producer
• One execution stack
– Solves a significant pain point for the
deployment of ML pipelines in a truly
open manner
– However there are risks
• ONNX still relatively young
• Operator / framework coverage
• Limitations of the standard
• Can one standard encompass all requirements &
use cases?
Get involved - it’s open source, (open governance)!
https://onnx.ai/

Thank you
Sign up for IBM Cloud and try Watson Studio: https://ibm.biz/BdznGk
codait.org
twitter.com/MLnick
github.com/MLnick
developer.ibm.com
