Large-Scale AI with Azure Container Service
Automated Software Development in Heterogeneous
GPU/CPU Environments for Seismic Modeling
2 x NVIDIA Tesla X2070
2 x (512 CORES, 6144 MB MEMORY SIZE)
"...the software system that orchestrates the whole thing ... is called Borg, and it’s one of the best-kept secrets of Google’s rapid evolution into the most dominant force on the web"
Azure N Series
2496 x2 CORES
12288 MB x2 MEMORY
• No need to deal with IT.
• Single entry point for the team to perform experiments.
• Scalability, based on the demands of training.
• Handling production SLAs for trained models.
ACS Engine: Open Source; Community and innovation
ACS: Managed by Azure; Choice: Swarm, Mesos, Kubernetes
AKS: Managed Kubernetes
Horizontal Pod Autoscaler (HPA)
kubectl autoscale deployment foo --min=4 --max=6 --cpu-percent=80
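The scaling decision behind the `kubectl autoscale` command above can be sketched in a few lines. This is a minimal illustration, assuming the scaling rule described in the Kubernetes documentation (desired replicas = ceil(current replicas x current metric / target metric)), then clamped to the `--min` and `--max` bounds. The function name and parameters are hypothetical, and the sketch ignores the controller's tolerance band and stabilization behavior.

```python
import math


def desired_replicas(current_replicas, current_cpu_percent,
                     target_cpu_percent, min_replicas, max_replicas):
    """Approximate the HPA replica calculation (hypothetical helper).

    Assumes the documented rule ceil(current * currentMetric / targetMetric),
    then clamps the result to [min_replicas, max_replicas].
    """
    raw = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Values mirror the slide: --min=4 --max=6 --cpu-percent=80
    for cpu in (20, 100, 160):
        print(cpu, desired_replicas(4, cpu, 80, 4, 6))
```

With the bounds from the slide, the deployment never drops below 4 pods or grows beyond 6, however far average CPU utilization drifts from the 80% target.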
Node-level Autoscaling
• https://github.com/Microsoft/CNTK
1784 samples/s
1709 samples/s
Azure NC6 Virtual Machine Azure NC6 Virtual Machine Node
CNTK training using CIFAR-10 (50,000 training images and 10,000 test images)
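To put the two throughput figures in perspective, the short sketch below computes their relative gap and the approximate time for one pass over the 50,000 CIFAR-10 training images. The slide text does not say which figure belongs to which configuration, so the code treats them only as "run A" and "run B"; the helper names are hypothetical.

```python
def relative_gap(a, b):
    """Relative difference between two throughputs, as a fraction of the larger."""
    return abs(a - b) / max(a, b)


def seconds_per_pass(num_samples, samples_per_second):
    """Time to process num_samples once at a steady throughput."""
    return num_samples / samples_per_second


if __name__ == "__main__":
    run_a, run_b = 1784.0, 1709.0   # samples/s, as reported on the slide
    train_images = 50_000           # CIFAR-10 training set size from the slide

    print(f"gap: {relative_gap(run_a, run_b) * 100:.1f}%")
    print(f"run A pass: {seconds_per_pass(train_images, run_a):.1f} s")
    print(f"run B pass: {seconds_per_pass(train_images, run_b):.1f} s")
```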
AlexNet on K80
BatchSize  Caffe  CNTK   MXNet  TensorFlow  Torch
86         339.9  265.2  274.7  882.7       358.0
128        327.9  236.9  245.2  853.0       335.8
256        311.4  217.5  229.6  818.2       315.7
512        301.5  217.9  217.6  796.2       307.0
1024       297.2  206.1  210.7  783.3       302.6
Source: http://dlbench.comp.hkbu.edu.hk/
