Deeplearning4J
François Garillot, @huitseeker
Neural Networks & Deep Learning
• graphical models w/ inputs and outputs
• represents composition of differentiable functions
• deep learning : expressivity exponential w.r.t depth
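As a rough illustration of "composition of differentiable functions", the sketch below chains an affine map and a tanh activation and obtains the derivative of the whole chain from the chain rule. The class and method names (ChainRuleSketch, Diff, affine, tanh, compose) are made up for this example and are not part of Deeplearning4J.

```java
/** Hypothetical illustration: a tiny "network" as a chain of differentiable scalar functions. */
final class ChainRuleSketch {

    /** A differentiable function: its value and its derivative at a point. */
    interface Diff {
        double value(double x);
        double derivative(double x);
    }

    /** An affine "layer" x -> w * x + b. */
    static Diff affine(double w, double b) {
        return new Diff() {
            public double value(double x) { return w * x + b; }
            public double derivative(double x) { return w; }
        };
    }

    /** A tanh activation; d/dx tanh(x) = 1 - tanh(x)^2. */
    static Diff tanh() {
        return new Diff() {
            public double value(double x) { return Math.tanh(x); }
            public double derivative(double x) {
                double t = Math.tanh(x);
                return 1.0 - t * t;
            }
        };
    }

    /** Composition "outer after inner"; the derivative follows the chain rule. */
    static Diff compose(Diff outer, Diff inner) {
        return new Diff() {
            public double value(double x) { return outer.value(inner.value(x)); }
            public double derivative(double x) {
                return outer.derivative(inner.value(x)) * inner.derivative(x);
            }
        };
    }

    public static void main(String[] args) {
        // tanh(2x - 1): one affine layer followed by one activation.
        Diff net = compose(tanh(), affine(2.0, -1.0));
        System.out.println(net.value(0.5) + " " + net.derivative(0.5));
    }
}
```

Backpropagation applies the same rule layer by layer, on vectors and matrices rather than scalars.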
Interesting results
• cat paper by Andrew Ng & Google
• AlexNet by Toronto
• last week: CNTK reaches parity with humans at speech
recognition
Industrial results
• Autonomous Driving : Drive.ai, Comma.ai + the usual
suspects
• Drug discovery : Deep Genomics (Frey) & Bayer
• Predictive Maintenance : Thales, Bosch
• optimistic pessimism (Moghimi, Manulife Financial Corp.)
Deep Learning in two steps : training, applying
• training tends to require lots of data, (R)
• but applying does not (embedded, etc).
So applying pre-trained models (TensorFrames) is not the
technical/business challenge.
Enterprise : have lots of data yourself, what to apply ?
Benchmarks aren't distributed
Training, but how ?
New Amazon GPU instances ?
DeepLearning4J and Spark: Successes and Challenges - François Garillot
Deep Learning Training
• Facebook, Amazon, Google, Baidu, Microsoft have this
distributed
• But what if you’re not one of them ?
Training, but how ?
Distributing training
• basically distributing SGD (R)
• challenge is AllReduce Communication
• Sparse updates, async communications
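A minimal sketch of what "distributing SGD" can look like in its simplest synchronous form: each worker computes a gradient on its own data shard, an all-reduce step averages the gradients, and every worker applies the same update. Everything here is a labeled assumption for illustration: the class DataParallelSgdSketch, the toy linear model, and the in-process loop standing in for real network communication. It is not a description of how Deeplearning4J or Spark implement training, and it leaves out the sparse and asynchronous variants mentioned above.

```java
import java.util.Arrays;

/** Hypothetical sketch: synchronous data-parallel SGD with an in-process "all-reduce". */
final class DataParallelSgdSketch {

    /** Gradient of the mean squared error of y = w[0] * x + w[1] on one shard. */
    static double[] localGradient(double[] w, double[] xs, double[] ys) {
        double gw = 0.0, gb = 0.0;
        for (int i = 0; i < xs.length; i++) {
            double err = w[0] * xs[i] + w[1] - ys[i];
            gw += 2.0 * err * xs[i];
            gb += 2.0 * err;
        }
        return new double[] { gw / xs.length, gb / xs.length };
    }

    /** Stand-in for AllReduce: every worker would end up with this element-wise mean. */
    static double[] allReduceMean(double[][] perWorker) {
        double[] mean = new double[perWorker[0].length];
        for (double[] g : perWorker)
            for (int j = 0; j < mean.length; j++) mean[j] += g[j] / perWorker.length;
        return mean;
    }

    /** Runs the given number of synchronous steps and returns the fitted weights. */
    static double[] train(double[][] shardXs, double[][] shardYs, double lr, int steps) {
        double[] w = new double[2];
        for (int s = 0; s < steps; s++) {
            double[][] grads = new double[shardXs.length][];
            for (int k = 0; k < shardXs.length; k++)   // on a cluster: one task per worker
                grads[k] = localGradient(w, shardXs[k], shardYs[k]);
            double[] g = allReduceMean(grads);          // the communication step
            for (int j = 0; j < w.length; j++) w[j] -= lr * g[j];
        }
        return w;
    }

    public static void main(String[] args) {
        // Two equally sized shards sampled from y = 3x + 1.
        double[][] xs = { { 0.0, 1.0 }, { 2.0, 3.0 } };
        double[][] ys = { { 1.0, 4.0 }, { 7.0, 10.0 } };
        System.out.println(Arrays.toString(train(xs, ys, 0.05, 5000)));
    }
}
```

The gradient exchanged at each step is as large as the model itself, which is why the communication step, rather than the local computation, tends to be the bottleneck.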
Deeplearning4J
• the first commercial-grade, open-source, distributed deep-
learning library written for Java and Scala
• Skymind : its commercial support arm
Scientific computing on the JVM
• libnd4j : Vectorization, 32-bit addressing, linalg (BLAS!)
• JavaCPP: generates JNI bindings to your C++ libs
• ND4J : numpy for the JVM, native superfast arrays
• DataVec : one-stop interface to an NDArray
• DeepLearning4J: orchestration, backprop, layer definition
• ScalNet: gateway drug, inspired by (and closely following)
Keras
Reinforcement learning
Killing the bottlenecks : generic
• swappable net backend : netty -> aeron (Hi Lightbend !)
• better support for binary data : big indexed tables
Binary, columnar, off-heap
• and more (Tamiya Onodera's group @ IBM Japan):
http://www.slideshare.net/ishizaki/exploiting-gpus-in-spark
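To make "binary, columnar, off-heap" concrete, here is a small sketch using only the JDK: one fixed-width column of doubles stored in a direct ByteBuffer, so the data lives in native memory outside the garbage-collected heap. The class OffHeapDoubleColumn and its methods are hypothetical names for this example; this is not the layout used by Spark, ND4J or DataVec.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.DoubleBuffer;

/** Hypothetical sketch: one fixed-width column of doubles kept off the JVM heap. */
final class OffHeapDoubleColumn {
    private final DoubleBuffer values;

    OffHeapDoubleColumn(int rows) {
        // allocateDirect reserves native memory outside the garbage-collected heap.
        values = ByteBuffer.allocateDirect(rows * Double.BYTES)
                           .order(ByteOrder.nativeOrder())
                           .asDoubleBuffer();
    }

    void set(int row, double v) { values.put(row, v); }

    double get(int row) { return values.get(row); }

    int rows() { return values.capacity(); }

    boolean isOffHeap() { return values.isDirect(); }

    /** A column scan reads one contiguous block of memory. */
    double sum() {
        double total = 0.0;
        for (int i = 0; i < values.capacity(); i++) total += values.get(i);
        return total;
    }

    public static void main(String[] args) {
        OffHeapDoubleColumn price = new OffHeapDoubleColumn(4);
        for (int i = 0; i < price.rows(); i++) price.set(i, i * 1.5);
        System.out.println(price.sum());
    }
}
```

Because the column is raw bytes in native byte order, native code can in principle read it without first copying it out of JVM objects.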
And if you don't care about Deep
Learning ?
• SPARK-6442 : better linear algebra than Breeze, please.
(sparse, performant, Java-compatible, and an OK license)
• SystemML got a best paper at VLDB'16, how about helping
out on nd4j ?
• ND4J only lacks sparse, but not for long ...
Questions ?
