© Tally Solutions Pvt. Ltd. All Rights Reserved
Distributed Deep Learning Framework over Spark
Dr. Vijay Srinivas Agneeswaran, Director and Head, Data Sciences, Tally Analytics Pvt. Ltd., Bangalore, India
and
Sai Sagar, Software Engineer, Impetus Infotech India Pvt. Ltd.
Contents
• Introduction: Basics of Artificial Neural Networks
• Deep Layered Networks: DLNs for Face Recognition, Different kinds of deep layered networks
• DLN Applications: Success stories and applications of DLNs
• Distributed DLNs: Challenges in Realizing Distributed DLNs, our Spark based Distributed DLN Framework
• Proof of Concept: Audio Sentiment Analysis
Introduction to Artificial Neural Networks (ANNs): Perceptron
Introduction to Artificial Neural Networks (ANNs): Sigmoid Neuron
• Small change in input = small change in behaviour.
• Output of a sigmoid neuron is given below:
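The formula on the slide was an image and did not survive extraction. For reference, the standard definition of a sigmoid neuron's output is given here; the symbols w (weights), x (inputs) and b (bias) are assumed notation, not taken from the slide:

```latex
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad z = \sum_j w_j x_j + b
```

Because \sigma is smooth, a small change in any w_j or in b produces a small change in the output, which is the property the first bullet refers to.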
Introduction to ANNs: Back Propagation
http://zerkpage.tripod.com/ann.htm
What is this?
NAND Gate!
initialize network weights (often small random values)
do
    forEach training example ex
        prediction = neural-net-output(network, ex) // forward pass
        actual = teacher-output(ex)
        compute error (prediction - actual) at the output units
        compute delta(wh) for all weights from hidden layer to output layer // backward pass
        compute delta(wi) for all weights from input layer to hidden layer // backward pass continued
        update network weights
until all examples classified correctly or another stopping criterion satisfied
return the network
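The pseudocode above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: it assumes a single hidden layer of sigmoid units, a squared-error loss, and the NAND truth table as training data. The function name `train_nand` and all hyperparameters (two hidden units, learning rate 0.5, the epoch limit) are hypothetical choices made for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_nand(hidden=2, lr=0.5, max_epochs=20000, seed=0):
    """Train a 2-input, one-hidden-layer sigmoid network on NAND (illustrative)."""
    rng = random.Random(seed)
    data = [((0, 0), 1), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

    # initialize network weights (small random values)
    w_i = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(hidden)]
    b_i = [0.0] * hidden
    w_h = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b_h = 0.0

    def forward(x):
        h = [sigmoid(sum(w * xi for w, xi in zip(w_i[j], x)) + b_i[j])
             for j in range(hidden)]
        y = sigmoid(sum(w * hj for w, hj in zip(w_h, h)) + b_h)
        return h, y

    for _ in range(max_epochs):
        for x, actual in data:
            h, prediction = forward(x)                      # forward pass
            error = prediction - actual
            # backward pass: delta at the output unit, then at the hidden units
            delta_out = error * prediction * (1.0 - prediction)
            delta_hid = [delta_out * w_h[j] * h[j] * (1.0 - h[j])
                         for j in range(hidden)]
            # update network weights (plain gradient descent)
            for j in range(hidden):
                w_h[j] -= lr * delta_out * h[j]
                b_i[j] -= lr * delta_hid[j]
                for k in range(2):
                    w_i[j][k] -= lr * delta_hid[j] * x[k]
            b_h -= lr * delta_out
        # stopping criterion: all examples classified correctly
        if all((forward(x)[1] > 0.5) == bool(t) for x, t in data):
            break

    return lambda x: forward(x)[1]
```

Calling `net = train_nand()` returns a function that maps an input pair such as `(1, 1)` to an output between 0 and 1; thresholding at 0.5 gives the predicted gate value.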
The network to identify the individual digits from the input image
http://neuralnetworksanddeeplearning.com/chap1.html
Deep Layered Networks (DLNs) for Face Recognition
DLN for Face Recognition
http://www.slideshare.net/hammawan/deep-neural-networks
Deep Learning Networks: Learning
• No general learning algorithm (No-free-lunch theorem by Wolpert 1996).
• Learning algorithm for specific tasks
• Limitations of BP
• Hinton’s deep belief networks as stack of RBMs.
• LeCun’s energy based learning for DBNs.
Deep Belief Networks
• This is a deep neural network composed of multiple layers of latent variables (hidden units or feature detectors)
• Can be viewed as a stack of RBMs
• Hinton along with his student proposed that these networks can be trained greedily one layer at a time
• Boltzmann Machine is a specific energy model with linear energy function.
http://www.iro.umontreal.ca/~lisa/twiki/pub/Public/DeepBeliefNetworks/DBNs.png
Other DL Networks: Auto Encoders (Auto-associators or Diabolo Networks)
• The aim of an auto encoder network is to learn a compressed representation for a set of data
• It is an unsupervised learning algorithm that applies back propagation, setting the target values equal to the inputs (i.e. it learns an approximation of the identity function)
• A denoising auto encoder avoids learning a trivial identity mapping by randomly corrupting the input, which the auto encoder must then reconstruct (denoise)
• Best applied when there is structure in the data
• Applications: dimensionality reduction, feature selection
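The corruption step of a denoising auto encoder can be sketched as follows. This is only an illustration of how a training pair might be built, assuming masking noise (zeroing random input components); the function names, the 0.3 corruption rate and the seed are hypothetical, and the encoder/decoder training itself is omitted.

```python
import random


def corrupt(x, corruption_rate, rng):
    """Masking noise: set each component to 0.0 with probability corruption_rate."""
    return [0.0 if rng.random() < corruption_rate else v for v in x]


def make_training_pair(x, corruption_rate=0.3, seed=0):
    """Return (corrupted_input, target). The target is always the clean input."""
    rng = random.Random(seed)
    return corrupt(x, corruption_rate, rng), list(x)
```

The network is fed the corrupted vector but is scored against the clean one, so simply copying its input is no longer a perfect solution.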
Why are Deep Learning Networks Brain-like?
• Statistical approach of traditional ML – SVMs or kernel approaches.
  – Not applicable in deep learning networks.
• Human brain – trophic factors
• Traditional ML – lot of data munging, representational issues (feature abstractor), before classifier can kick in.
• Deep learning – allows the system to learn representations as well naturally.
Copyright @Impetus Technologies, 2014
Success stories of DLNs
• Android voice recognition system – based on DLNs
• Improves accuracy by 25% compared to state-of-art
• Microsoft Skype Translate software and Digital assistant Cortana
• 1.2 million images, 1000 classes (ImageNet Data) – error rate of 15.3%, better than state of art at 26.1%
Success stories of DLNs (continued)
• Senna system – PoS tagging, chunking, NER, semantic role labeling, syntactic parsing
• Comparable F1 score with state-of-art with huge speed advantage (5 days VS few hours).
• DLNs VS TF-IDF: 1 million documents, relevance search. 3.2ms VS 1.2s.
• Robot navigation
Potential Applications of DLNs
• Speech recognition/enhancement
• Video sequencing
• Emotion recognition (video/audio)
• Malware detection
• Robotics – navigation
• Multi-modal learning (text and image)
• Natural Language Processing
Challenges in Realizing DLNs
• Large no. of training examples – high accuracy.
  – Large no. of parameters can also improve accuracy.
• Inherently sequential nature – freeze up one layer for learning.
• GPUs to improve training speedup
  – Limitations – CPU_to_GPU data transfers.
• Distributed DLNs – Jeffrey Dean’s work.
WiP: Proof of Concept
• Sentiment analysis of continuous speech data
• Stacking RBMs to make a deep belief network.
– First a GRBM (Gaussian RBM) is trained to model a window of frames of real-valued acoustic coefficients.
– Then the states of the binary hidden units of the GRBM are used as data for training an RBM.
– This is repeated to create as many hidden layers as desired.
– Then the stack of RBMs is converted to a single generative model, a DBN, by replacing the undirected connections of the lower level RBMs by top-down, directed connections.
– Finally, a pre-trained DBN-DNN is created by adding a “softmax” output layer that contains one unit for each possible state of each HMM. The DBN-DNN is then discriminatively trained to predict the HMM state corresponding to the central frame of the input window in a forced alignment.
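The greedy stacking procedure above can be sketched in plain Python. This is a simplified illustration, not the authors' Spark framework. It assumes binary RBMs at every layer (the slide uses a Gaussian RBM for the first layer), trains each RBM with one step of contrastive divergence (CD-1), and stops before the DBN conversion and the softmax fine-tuning. The class `RBM`, the function `train_stack` and all hyperparameters are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class RBM:
    """Binary-binary RBM with weights W[hidden][visible] (illustrative only)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = [[rng.gauss(0.0, 0.01) for _ in range(n_visible)]
                  for _ in range(n_hidden)]
        self.b = [0.0] * n_visible   # visible biases
        self.c = [0.0] * n_hidden    # hidden biases

    def hidden_probs(self, v):
        return [sigmoid(self.c[j] + sum(w * vi for w, vi in zip(self.W[j], v)))
                for j in range(len(self.c))]

    def visible_probs(self, h):
        return [sigmoid(self.b[i] + sum(self.W[j][i] * h[j] for j in range(len(h))))
                for i in range(len(self.b))]

    def sample(self, probs):
        return [1.0 if self.rng.random() < p else 0.0 for p in probs]

    def cd1_update(self, v0, lr=0.1):
        ph0 = self.hidden_probs(v0)        # positive phase
        h0 = self.sample(ph0)
        v1 = self.visible_probs(h0)        # one Gibbs step gives the negative phase
        ph1 = self.hidden_probs(v1)
        for j in range(len(self.c)):
            for i in range(len(self.b)):
                self.W[j][i] += lr * (ph0[j] * v0[i] - ph1[j] * v1[i])
            self.c[j] += lr * (ph0[j] - ph1[j])
        for i in range(len(self.b)):
            self.b[i] += lr * (v0[i] - v1[i])


def train_stack(data, layer_sizes, epochs=5, seed=0):
    """Greedily train one RBM per entry in layer_sizes, bottom to top."""
    rng = random.Random(seed)
    stack, layer_input = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(layer_input[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in layer_input:
                rbm.cd1_update(v)
        stack.append(rbm)
        # sampled binary hidden states become the training data for the next RBM
        layer_input = [rbm.sample(rbm.hidden_probs(v)) for v in layer_input]
    return stack
```

For example, `train_stack([[1, 0, 1, 0], [0, 1, 0, 1]], [3, 2])` trains a 4-3 RBM and then a 3-2 RBM on top of it. Each layer is trained with the layers below it frozen, which is the "inherently sequential" aspect that makes distributing the work non-trivial.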
Conclusions
• ANN to Distributed Deep Learning
• Key ideas in deep learning
• Need for distributed realizations.
• DistBelief, deeplearning4j etc.
• Our work on large scale distributed deep learning
• Deep learning leads us from statistics based machine learning towards brain inspired AI.
Current Work
• Tally
• Accounting/business software – widely used in SME.
• 100 million customers worldwide.
• Tally Analytics is a new startup
• Trying to create value from the business data of Tally.
• Supply chain – use of AI in inventory prediction, creating value in supply chain data.
• What is sold where, when and at what price. All pervading data?
• We are hiring. Send CVs to vijay.srinivas@tallysolutions.com.
Thank You!
Contact Details:
Twitter: a_vijaysrinivas
LinkedIn (Please write an introductory note before connecting):
https://in.linkedin.com/in/vijaysrinivasagneeswaran
Email: vijay.srinivas@tallysolutions.com
Energy Based Models
• RBMs are Energy Based Models (EBMs)
• An EBM associates an energy with every configuration of a system
• Learning corresponds to modifying the shape of the energy function so that it has desirable properties
• As in physics, lower energy = more stability
• So, modify the shape of the energy function such that the desirable configurations have lower energy
http://www.cs.nyu.edu/~yann/research/ebm/loss-func.png
Other DL networks: Convolutional Networks
Yann LeCun, Patrick Haffner, Léon Bottou, and Yoshua Bengio. 1999. Object Recognition with Gradient-Based Learning. In Shape, Contour and Grouping in Computer Vision, David A. Forsyth, Joseph L. Mundy, Vito Di Gesù, and Roberto Cipolla (Eds.). Springer-Verlag, London, UK, 319-.
Other Brain-like Approaches
• Recurrent Neural Networks
  – Long Short Term Memory (LSTM), temporal data
• Sum-product networks
  – Deep architectures of sum-product networks
• Hierarchical temporal memory
  – Online structural and algorithmic model of neocortex.
Recurrent Neural Networks
• Connections between units form a directed cycle, i.e. typical feedback connections
• RNNs can use their internal memory to process arbitrary sequences of inputs
• RNNs struggle to learn dependencies that reach far back into the past
• LSTMs solve this problem by introducing memory cells
• These memory cells can remember a value for an arbitrary amount of time
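The feedback connection can be illustrated with a one-unit recurrent network. This is a toy sketch, not a trained model: the update rule h_t = tanh(w_x * x_t + w_h * h_(t-1) + b) is the usual simple (Elman-style) recurrence, and the weight values are arbitrary assumptions chosen for the example.

```python
import math


def rnn_final_state(inputs, w_x=0.8, w_h=0.5, b=0.0):
    """Run a one-unit recurrent network over a sequence and return its last state."""
    h = 0.0
    for x in inputs:
        # w_h * h feeds the previous state back in: this is the directed cycle
        h = math.tanh(w_x * x + w_h * h + b)
    return h
```

The same function handles sequences of any length, and the state after `[1, 0]` differs from the state after `[0, 0]`, so the unit remembers the earlier input. With these weights the trace of that first input shrinks at every step, which is a small-scale picture of why a plain RNN has trouble with long-range dependencies and why LSTM adds dedicated cells for holding values.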
Sum-Product Networks (SPN)
• An SPN is a deep network model structured as a directed acyclic graph
• These networks allow the probability of an event to be computed quickly
• SPNs try to express multilinear functions in computationally compact forms, i.e. as a combination of additions and multiplications
• Leaves correspond to variables and internal nodes correspond to sums and products
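A very small SPN makes the sum and product structure concrete. The sketch below is a hypothetical example over two binary variables, x1 and x2: the network shape, the weights and the helper names (`leaf`, `sum_node`, `product_node`, `bernoulli`) are all assumptions made for illustration.

```python
def leaf(var, value):
    """Indicator leaf: 1.0 if var is unobserved or equals value, else 0.0."""
    return lambda evidence: 1.0 if evidence.get(var) in (None, value) else 0.0


def sum_node(weighted_children):
    """Weighted sum of child values; the weights should add up to 1."""
    return lambda evidence: sum(w * child(evidence) for w, child in weighted_children)


def product_node(children):
    """Product of child values; the children cover disjoint variables."""
    def evaluate(evidence):
        result = 1.0
        for child in children:
            result *= child(evidence)
        return result
    return evaluate


def bernoulli(var, p_true):
    """Univariate distribution written as a sum over the two indicator leaves."""
    return sum_node([(p_true, leaf(var, 1)), (1.0 - p_true, leaf(var, 0))])


# Root: a mixture (sum) of two factorized distributions (products).
spn = sum_node([
    (0.6, product_node([bernoulli("x1", 0.9), bernoulli("x2", 0.2)])),
    (0.4, product_node([bernoulli("x1", 0.1), bernoulli("x2", 0.7)])),
])
```

Evaluating `spn({"x1": 1, "x2": 1})` gives the joint probability in a single bottom-up pass. Leaving a variable out of the evidence, as in `spn({"x1": 1})`, marginalizes it in the same pass, which is why probabilities of events can be computed quickly.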
Hierarchical Temporal Memory
• An online machine learning model developed by Jeff Hawkins
• This model learns one instance at a time
• Best explained by an online stock model: today’s stock situation helps in predicting tomorrow’s stock
• An HTM network is a tree-shaped hierarchy of levels
• Higher hierarchy levels can use patterns learned at lower levels. This is adopted from the learning model used by the brain in the form of the neocortex
http://en.wikipedia.org/wiki/Hierarchical_temporal_memory
Mathematical Equations
• The Energy Function is defined as follows:
$$E(x, h) = -b'x - c'h - h'Wx$$
where b' and c' are the biases and W represents the weights connecting the visible layer and the hidden layer.
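The energy function can be evaluated directly for a given configuration. The sketch below is a minimal plain-Python version; the function name `rbm_energy` and the convention that `W` is indexed as `W[hidden][visible]` are assumptions made for the example.

```python
def rbm_energy(x, h, W, b, c):
    """E(x, h) = -b'x - c'h - h'Wx, with W indexed as W[hidden][visible]."""
    visible_term = sum(bi * xi for bi, xi in zip(b, x))
    hidden_term = sum(cj * hj for cj, hj in zip(c, h))
    interaction = sum(h[j] * W[j][i] * x[i]
                      for j in range(len(h)) for i in range(len(x)))
    return -visible_term - hidden_term - interaction
```

For instance, with `x = [1, 0]`, `W = [[2.0, -1.0]]`, `b = [0.5, 0.5]` and `c = [1.0]`, the configuration with `h = [1]` has energy -3.5 while `h = [0]` has energy -0.5, so the model treats the first configuration as the more stable one.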
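Learning Energy Based Models
• Energy based models can be learnt by performing gradient descent on the negative log-likelihood of the training data
• It has the following form:
$$-\frac{\partial \log p(x)}{\partial \theta} = \frac{\partial F(x)}{\partial \theta} - \sum_{\tilde{x}} p(\tilde{x}) \frac{\partial F(\tilde{x})}{\partial \theta}$$
The first term on the right-hand side is the positive phase and the second term is the negative phase.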


Editor's Notes

  • #7: Reference: http://neuralnetworksanddeeplearning.com/chap1.html. Consider the problem of identifying the individual digits from the input image. Each image is 28 by 28 pixels. The network is then designed as follows. Input layer (image) -> 28*28 = 784 neurons; each neuron corresponds to a pixel. The size of the output layer is given by the number of digits to be identified, i.e. 10 (0 to 9). The intermediate hidden layer can be experimented with using varied numbers of neurons. Let us fix it at 10 nodes.
  • #8: Reference: http://neuralnetworksanddeeplearning.com/chap1.html. How about recognizing a human face from a given set of random images? Attack this problem in the same fashion as explained earlier. Input -> image pixels; output -> is it a face or not? (a single node). A face can be recognized by answering questions like “Is there an eye in the top left?”, “Is there a nose in the middle?”, etc. Each question corresponds to a hidden layer.
  • #12: http://ufldl.stanford.edu/wiki/index.php/Autoencoders_and_Sparsity
  • #24: http://deeplearning4j.org/convolutionalnets.html. Refined by LeCun in 1989, mainly to apply CNNs to identify variability in 2D image data. Introduced in 1980 by Fukushima. A type of RBMs where the communication is absent across the nodes in the same layer. Nodes are not connected to every other node of the next layer. Symmetry is not there. Convolution networks learn images by pieces rather than learning as a whole (RBM does this). Designed to use minimal amounts of pre-processing.
  • #26: http://www.idsia.ch/~juergen/rnn.html
  • #27: http://deep-awesomeness.tumblr.com/post/63736448581/sum-product-networks-spm http://lessoned.blogspot.in/2011/10/intro-to-sum-product-networks.html
  • #28: http://en.wikipedia.org/wiki/Hierarchical_temporal_memory
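The layer sizes described in note #7 (784 input neurons, 10 hidden nodes, 10 output neurons) can be turned into a parameter count. The sketch below assumes a fully connected network with one bias per non-input neuron, which the note does not state explicitly; the function name `count_parameters` is hypothetical.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected feed-forward network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# input pixels, hidden nodes, output digits (sizes taken from note #7)
layers = [28 * 28, 10, 10]
total = count_parameters(layers)
```

Under these assumptions the first layer contributes 784*10 + 10 = 7850 parameters and the second 10*10 + 10 = 110, for a total of 7960. Changing the hidden layer size in `layers` shows how quickly the count grows.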