ONNX and ONNX Runtime
By : Vishwas Narayan
Agenda
1. Introduction to the ONNX ecosystem
2. Production Usage
3. ONNX Technical Design
ML models : Research to production
Data Collection → Training → Conversion → Inference and Deployment
Data from different sources → Collect → Transform → Normalize (and other steps) → Data Storage
Build a model from the data
Deploy the model
But in this research-to-production workflow, ML engineers and data scientists have overlapping roles.
Many Products use Machine Learning
We train using many frameworks, and we deploy to many targets.
ONNX
ONNX is an open-source machine learning model representation format.
ONNX establishes a standard set of operators - the building blocks of machine
learning and deep learning models - as well as a standard file format, allowing AI
developers to use models across a range of frameworks, tools, runtimes, and
compilers.
ONNX now covers
● Training: CPU, GPU
● Deployment: edge/mobile
How do I get an ONNX model?
1. ONNX Model Zoo
2. Model creation services such as Azure Custom Vision and AutoML
3. Convert an existing model from another framework
4. End-to-end machine learning services such as Azure Machine Learning
You can also get them from:
● Native export from PyTorch and CNTK
● Converters such as ONNX Go Live (OLive): https://github.com/microsoft/OLive
Inferencing a deep learning model requires the model (graph) and its weights.
ONNX Runtime
● Available on all platforms
● Hardware vendors provide support
https://github.com/microsoft/DirectML
https://github.com/onnx/tutorials
ONNX Runtime is available on many platforms and ships in many products that leverage ONNX models.
The numbers back this up:
https://www.onnxruntime.ai/docs/how-to/tune-performance.html
https://github.com/onnx/tutorials
Performance Metrics
● Accuracy
● Loss
● Recall
● Latency
Production uses today include
● Word - grammar check
● Bing - Q&A
● Cognitive Services
And many more
Technical Design Principles
● Support new architectures
● Support traditional ML as well
● Backward compatibility
● Compact
● Cross-platform representation for serialization
ONNX Specification
● ONNX is an open specification that consists of the following components:
● A definition of an extensible computation graph model
● Definitions of standard data types
● Definitions of built-in, versioned operators (schemas)
ONNX model File Format
Model
a. Version Info
b. Metadata
c. Acyclic Computation Data Flow Graph
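The "acyclic" requirement means the graph's node dependencies must admit a topological order. A stdlib-only sketch (the node names and edges here are hypothetical) of validating that a computation graph is acyclic:

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical graph: each node maps to the nodes whose outputs it consumes.
graph = {
    "conv1": set(),            # reads only model inputs
    "relu1": {"conv1"},
    "pool1": {"relu1"},
    "fc1":   {"pool1"},
}

def execution_order(nodes):
    """Return a valid execution order, or raise CycleError if the graph is cyclic."""
    return list(TopologicalSorter(nodes).static_order())

order = execution_order(graph)
print(order)  # ['conv1', 'relu1', 'pool1', 'fc1']
```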
ONNX model File Format
Graph
a. Inputs and outputs
b. List of computation nodes
c. Graph name
ONNX model File Format
Computational Node
a. Zero or more inputs of defined types
b. One or more outputs of defined types
c. Operator
d. Operator parameters
ONNX Supported Data Types
Tensor data types
● int8, int16, int32, int64
● uint8, uint16, uint32, uint64
● float16, float32, float64 (float, double)
● bool
● string
● complex64, complex128
ONNX Supported Data Types
Non-tensor data types
● Sequence
● Map
Operator in ONNX
https://github.com/onnx/onnx/blob/master/docs/Operators.md
● An operator is identified by <name,domain,version>
● Core ops (ONNX and ONNX-ML)
○ Should be supported by ONNX-compatible products
○ Generally cannot be meaningfully further decomposed
○ Currently 124 ops in the ai.onnx domain and 18 in ai.onnx.ml
○ Supports many scenarios/problem areas including image classification,
recommendation, natural language processing, etc.
ONNX - Custom Ops
● Ops specific to a framework or runtime
● Indicated by custom domain name
● Primarily meant to be a safety valve
ONNX versioning
Done at 3 levels
● IR version (file format)
● Opset version: ONNX models declare which operator sets they require as a list
of two-part operator set IDs (domain, opset_version)
● Operator version: a given operator is identified by a three-tuple
(domain, op_type, op_version)
https://github.com/onnx/onnx/blob/master/docs/Versioning.md
All three versions evolve over time.
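A hedged, stdlib-only sketch of the opset check a runtime performs when loading a model: the model declares its required (domain, opset_version) pairs, and the runtime compares them against the highest opset it supports per domain. The version numbers below are illustrative, not authoritative:

```python
# Highest opset version this (hypothetical) runtime supports, per domain.
SUPPORTED = {"ai.onnx": 13, "ai.onnx.ml": 2}

def check_opsets(required):
    """required: list of (domain, opset_version) pairs declared by a model.

    Returns a list of human-readable problems; empty means the model loads.
    """
    problems = []
    for domain, version in required:
        top = SUPPORTED.get(domain)
        if top is None:
            problems.append(f"unknown domain {domain!r}")
        elif version > top:
            problems.append(f"{domain} opset {version} > supported {top}")
    return problems

assert check_opsets([("ai.onnx", 12)]) == []
assert check_opsets([("ai.onnx", 15)]) == ["ai.onnx opset 15 > supported 13"]
```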
ONNX Runtime - Design Principles
● Provide complete implementation of the ONNX standard — implement all
versions of the operators (since opset 7)
● Backward compatibility
● High performance
● Cross platform
● Leverage custom accelerators and runtimes to enable maximum performance
(execution providers)
● Support hybrid execution of the models
● Extensible through pluggable modules
https://www.onnxruntime.ai/docs/resources/high-level-design.html
Graph Partitioning
Given a mutable graph, the graph partitioner assigns graph nodes to execution providers according to
their capabilities; the goal is the best performance in a heterogeneous environment.
ONNX Runtime uses a "greedy" node assignment mechanism:
● Users specify a preferred execution provider list, in order
● ONNX Runtime goes through the list in order, checks each provider's capability, and assigns nodes
to a provider if it can run them
FUTURE:
● Manually tuned partitioning
● ML based partitioning
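The greedy assignment above can be sketched in plain Python (the provider names and capability sets here are illustrative): walk the preferred provider list in order and give each provider every not-yet-assigned node it can run:

```python
def partition(nodes, preferred_providers, capability):
    """Greedily assign each node to the first provider that can run it.

    nodes: list of op types in the graph (assumed unique for this sketch)
    preferred_providers: provider names in user-preferred order
    capability: provider name -> set of op types it can run
    """
    assignment = {}
    for provider in preferred_providers:
        for node in nodes:
            if node not in assignment and node in capability[provider]:
                assignment[node] = provider
    return assignment

# Illustrative capabilities: the accelerator runs a subset, CPU runs everything.
caps = {
    "AcceleratorProvider": {"Conv", "Relu"},
    "CPUExecutionProvider": {"Conv", "Relu", "ArgMax"},
}
plan = partition(["Conv", "Relu", "ArgMax"],
                 ["AcceleratorProvider", "CPUExecutionProvider"], caps)
print(plan)
# Conv and Relu land on the accelerator; ArgMax falls back to CPU.
```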
Rewrite rule
An interface for finding patterns (of specific nodes) and applying rewrite
rules against a sub-graph.
Graph Transformer
An interface created for applying graph transformation with full graph editing
capability.
Transformer Levels
Level 0: Transformers always applied after graph partitioning (e.g. cast insertion,
memory-copy insertion)
Level 1: General transformers not specific to any execution provider (e.g.
dropout elimination)
Level 2: Execution-provider-specific transformers (e.g. transpose insertion for
FPGA)
Graph Optimizations
Level 0
Cast
MemCopy
https://gitee.com/arnoldfychen/onnxruntime/blob/master/docs/ONNX_Runtime_Perf_Tuning.md
Graph Optimizations
Level 1
● EliminateIdentity
● EliminateSlice
● UnsqueezeElimination
● EliminateDropout
● FuseReluClip
● ShapeToInitializer
● ConvAddFusion
● ConvMulFusion
● ConvBNFusion
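As a toy illustration of a Level 1 rewrite such as EliminateIdentity (the node representation below is made up for this sketch): drop each Identity node and rewire its consumers to read the Identity's input directly:

```python
def eliminate_identity(nodes):
    """nodes: list of (op, inputs, output) triples; returns the optimized list."""
    # Map each Identity's output name back to its input name.
    alias = {out: ins[0] for op, ins, out in nodes if op == "Identity"}

    def resolve(name):
        # Follow chains of Identity nodes (Identity feeding Identity).
        while name in alias:
            name = alias[name]
        return name

    return [(op, [resolve(i) for i in ins], out)
            for op, ins, out in nodes if op != "Identity"]

before = [
    ("Conv",     ["X", "W"], "c"),
    ("Identity", ["c"],      "c2"),
    ("Relu",     ["c2"],     "Y"),
]
after = eliminate_identity(before)
print(after)  # [('Conv', ['X', 'W'], 'c'), ('Relu', ['c'], 'Y')]
```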
Execution Provider
A hardware accelerator interface to query its capability and get corresponding executables.
● Kernel-based execution providers
These execution providers provide implementations of operators defined in ONNX (e.g.
CPUExecutionProvider, CUDAExecutionProvider, MKLDNNExecutionProvider, etc.)
● Runtime-based execution providers
These execution providers may not implement individual ONNX ops, but they can run whole or
partial ONNX graphs. For example, one can run several ONNX ops (a sub-graph) together as a
single function (e.g. TensorRTExecutionProvider, nGraphExecutionProvider, etc.)
https://www.onnxruntime.ai/docs/reference/execution-providers/
Extending ONNX runtime
● Execution providers
○ Implement the IExecutionProvider interface
■ Examples: TensorRT, OpenVINO, nGraph, Android NNAPI, etc.
● Custom operators
○ Support operators outside the ONNX standard
○ Support for writing custom ops in both C/C++ and Python
● Graph optimizers
○ Implement the Graph Transformer interface
ONNX Go Live (OLive)
● Automates the process of shipping ONNX models
● Integrates
○ model conversion
○ correctness testing
○ performance tuning
into a single pipeline, and outputs a production-ready ONNX model with ONNX Runtime configurations
(execution provider + optimization options)
https://github.com/microsoft/OLive
To learn more, see:
https://azure.microsoft.com/en-in/blog/onnx-runtime-is-now-open-source/
https://www.onnxruntime.ai/python/tutorial.html
Thank you