Recurrent Neural Network
What's in it for you?
• What is a Neural Network?
• Why Recurrent Neural Network?
• Popular Neural Networks
• How does an RNN work?
• Vanishing and Exploding Gradient Problem
• Long Short Term Memory (LSTM)
• Use case implementation of LSTM
• What is a Recurrent Neural Network?
Introduction to RNN
Do you know how Google's autocomplete feature predicts the rest of the words a user is typing?
[Diagram: three unrolled RNN cells, each with input x, hidden state h, output y and parameters A, B, C]
1. A collection of large volumes of the most frequently occurring consecutive words is gathered.
2. This collection is fed to a Recurrent Neural Network.
3. The network analyses the data by finding the sequences of words that occur frequently and builds a model to predict the next word in the sentence.
4. So when a user types "What is the best food to eat in Las" into Google search, it autocompletes the search with "Vegas".
What is a Neural Network?
Neural networks, used in deep learning, consist of different layers connected to each other and are modeled on the structure and functions of the human brain. A neural network learns from huge volumes of data and uses complex algorithms to train itself.
[Diagram: image pixels of 2 different breeds of dog pass through the Input Layer, Hidden Layers and Output Layer, and the network identifies the dog's breed as German Shepherd or Labrador]
Such networks do not require memorizing the past output.
Popular Neural Networks
• Feed Forward Neural Network: used in general regression and classification problems
• Convolutional Neural Network: used for image recognition
• Deep Neural Network: used for acoustic modeling
• Deep Belief Network: used for cancer detection
• Recurrent Neural Network: used for speech recognition
Feed Forward Neural Network
[Simplified diagram: an input x flows through the Input Layer, the Hidden Layers and the Output Layer to produce the predicted output ŷ]
In a feed-forward network, information flows only in the forward direction: from the input nodes, through the hidden layers (if any), to the output nodes. There are no cycles or loops in the network.
• Decisions are based on the current input only
• No memory of past inputs
• No way to take future inputs into account
Why Recurrent Neural Network?
Issues in a Feed Forward Neural Network:
1. It cannot handle sequential data
2. It cannot memorize previous inputs
3. It considers only the current input
Why Recurrent Neural Network?
h
y
x
A
B
C
Recurrent Neural
Network
01
02
03
can handle
sequential data
can memorize previous
inputs due to its internal
memory
considers the current input and
also the previously received
inputs
Solution to Feed Forward Neural Network
Applications of RNN
Image captioning
An RNN can be used to caption an image by analyzing the activities present in it: "A dog catching a ball in mid air".
Applications of RNN
Time series prediction
Any time series problem, like predicting the price of a stock in a particular month, can be solved using an RNN.
Applications of RNN
Natural Language Processing
Text mining and sentiment analysis can be carried out using an RNN. For example, the sentence "When it rains, look for rainbows. When it's dark, look for stars." is classified as expressing a positive sentiment.
Applications of RNN
Machine Translation
Given an input in one language, an RNN can be used to translate it into different languages as output. Here, a person speaks in English and the speech is translated into Chinese, Italian, French, German and Spanish.
What is a Recurrent Neural Network?
A Recurrent Neural Network works on the principle of saving the output of a layer and feeding it back to the input in order to predict the output of the layer.
[Diagram: the compact RNN cell (input x, hidden state h, output y, parameters A, B, C) is equivalent to a network whose hidden layers feed back into themselves between the Input Layer and the Output Layer]
How does an RNN look?
[Diagram: the RNN unrolled over time steps t-1, t and t+1, with inputs x(t-1), x(t), x(t+1), hidden states h(t-1), h(t), h(t+1) and outputs y(t-1), y(t), y(t+1); A, B and C are the parameters shared across all time steps]
How does an RNN work?
The hidden state is updated at every time step:

h(t) = f_C(h(t-1), x(t))

where:
h(t) = new state
h(t-1) = old state
x(t) = input vector at time step t
f_C = function with parameter C
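To make the update rule concrete, here is a minimal NumPy sketch of one recurrent step over a short sequence; the weight names Wxh, Whh and Why are hypothetical stand-ins for the parameters labelled A, B and C in the diagram:

import numpy as np

rng = np.random.default_rng(0)
Wxh = rng.standard_normal((16, 10)) * 0.01  # input-to-hidden weights ("A")
Whh = rng.standard_normal((16, 16)) * 0.01  # hidden-to-hidden weights ("B")
Why = rng.standard_normal((5, 16)) * 0.01   # hidden-to-output weights ("C")

def rnn_step(h_prev, x_t):
    # h(t) = f_C(h(t-1), x(t)) with a tanh non-linearity
    h_t = np.tanh(Whh @ h_prev + Wxh @ x_t)
    y_t = Why @ h_t  # output at time step t
    return h_t, y_t

h = np.zeros(16)  # initial state
for x_t in rng.standard_normal((3, 10)):  # a sequence of 3 input vectors
    h, y = rnn_step(h, x_t)  # the same weights are reused at every time step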
Types of Recurrent Neural Network
• One to one: single input, single output. A one-to-one network is known as the Vanilla Neural Network and is used for regular machine learning problems.
• One to many: single input, multiple outputs. A one-to-many network generates a sequence of outputs. Example: image captioning.
• Many to one: multiple inputs, single output. A many-to-one network takes in a sequence of inputs. Example: sentiment analysis, where a given sentence is classified as expressing a positive or negative sentiment.
• Many to many: multiple inputs, multiple outputs. A many-to-many network takes in a sequence of inputs and generates a sequence of outputs. Example: machine translation.
Vanishing Gradient Problem
While training an RNN, the slope (the gradient, i.e. the change in Y over the change in X) can become either too small or very large, and this makes training difficult. When the slope is too small, the problem is known as a vanishing gradient.
[Diagram: as the error is backpropagated through states s0…s3, information is lost through time]
Exploding Gradient Problem
When the slope tends to grow exponentially instead of decaying, the problem is called an exploding gradient.
Issues caused by the gradient problems:
• Long training time
• Poor performance
• Bad accuracy
[Diagram: as the error is backpropagated through states s0…s3, information is lost through time]
Explaining the Gradient Problem
Consider the following 2 examples to understand what should be the next word in the sequence:
"The person who took my bike and…………………………………………., ____ a thief." (was)
"The students who got into Engineering with …………………………………………., ____ from Asia." (were)
In order to predict the next word in the sequence, the RNN must memorize the previous context: whether the subject was a singular noun or a plural noun.
It can sometimes be difficult for the error to backpropagate all the way to the beginning of the sequence to predict what the output should be.
[Diagram: an RNN unrolled over states s0…st with inputs x1…xt and outputs y1…yt]
Solution to Gradient Problem
Identity
Initialization
Truncated
Backpropagation
Gradient
Clipping
2
1 3
Exploding Gradient
Weight
initialization
Choosing the right Activation
Function
Long Short-Term
Memory Networks
(LSTMs)
2
1 3
Vanishing Gradient
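As a concrete illustration, gradient clipping is typically a one-line change in modern frameworks; a minimal sketch, assuming TensorFlow/Keras:

import tensorflow as tf

# clipvalue caps each gradient component at +/-0.5 before the update is applied;
# clipnorm would instead rescale the whole gradient when its norm exceeds a threshold.
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01, clipvalue=0.5)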
Long-Term Dependencies
Suppose we try to predict the last word in the text "The clouds are in the ___".
Here we do not need any further context: it's pretty clear the last word is going to be "sky".
Now suppose we try to predict the last word in the text "I have been staying in Spain for the last 10 years… I can speak fluent ______."
The word we predict ("Spanish") depends on words that appeared much earlier in the context.
• Here we need the context of "Spain" to predict the last word in the text.
• It's possible for the gap between the relevant information and the point where it is needed to become very large.
• LSTMs help us solve this problem.
Long Short-Term Memory Networks
LSTMs are a special kind of Recurrent Neural Network, capable of learning long-term dependencies. Remembering information for long periods of time is their default behavior.
All recurrent neural networks have the form of a chain of repeating modules of neural network. In a standard RNN, this repeating module has a very simple structure, such as a single tanh layer.
[Diagram: a chain of repeating modules A, each containing a single tanh layer, with inputs xt-1, xt, xt+1 and hidden states ht-1, ht, ht+1]
LSTMs also have a chain-like structure, but the repeating module is different: instead of a single neural network layer, there are four interacting layers communicating in a very special way.
[Diagram: the same chain of modules, but each module combines sigmoid and tanh layers through pointwise multiplication (x) and addition (+)]
The 3-step process of LSTMs:
Step 1: Forget irrelevant parts of the previous state
Step 2: Selectively update the cell state values
Step 3: Output certain parts of the cell state
Throughout these steps, the cell state Ct-1 is carried along the chain and updated to Ct.
Working of LSTMs
Step 1: Decides how much of the past it should remember.
The first step in the LSTM is to decide which information to omit from the cell in that particular time step. This is decided by the sigmoid function: it looks at the previous state (ht-1) and the current input xt and computes the function.
ft = forget gate: decides which information from the previous time step to delete because it is not important.
[Diagram: the LSTM module, with Ct-1 entering the cell, the forget gate ft multiplying into the cell state, and ht as the output]
Consider an LSTM fed with the following inputs from the previous and present time steps:
Previous output (ht-1): "Alice is good in Physics. John on the other hand is good in Chemistry."
Current input (xt): "John plays football well. He told me yesterday over phone that he had served as the captain of his college football team."
• The forget gate realizes there might be a change in context after encountering the first full stop.
• It compares this with the current input sentence at xt.
• The next sentence talks about John, so the information on Alice is deleted.
• The position of the subject is vacated and assigned to John.
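For reference, in the standard LSTM formulation this step is a sigmoid over the previous state and the current input:

ft = σ(Wf · [ht-1, xt] + bf)

where Wf and bf are the forget gate's weight matrix and bias, and σ squashes each value to the range (0, 1): a 0 means "completely forget this part of the cell state" and a 1 means "keep it entirely".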
Step 2: Decides how much this unit should add to the current state.
The second layer has 2 parts: a sigmoid function and a tanh function. The sigmoid function decides which values to let through (0 or 1). The tanh function gives weightage to the values that are passed, deciding their level of importance (-1 to 1).
it = input gate: determines which information to let through based on its significance in the current time step.
Consider the current input at xt: "John plays football well. He told me yesterday over phone that he had served as the captain of his college football team."
• The input gate analyses the important information.
• "John plays football" and "he was the captain of his college team" are important.
• "He told me yesterday over phone" is less important, hence it is forgotten.
• This process of adding some new information is done via the input gate.
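In the standard formulation, this step combines the input gate with a candidate cell state, and the cell state update then applies both the forget and input gates:

it = σ(Wi · [ht-1, xt] + bi)
C̃t = tanh(WC · [ht-1, xt] + bC)
Ct = ft * Ct-1 + it * C̃t

Here it selects which candidate values to admit, and C̃t carries the new candidate information, weighted between -1 and 1 by tanh.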
Step 3: Decides what part of the current cell state makes it to the output.
The third step is to decide what the output will be. First, we run a sigmoid layer, which decides what parts of the cell state make it to the output. Then we put the cell state through tanh to push the values between -1 and 1, and multiply it by the output of the sigmoid gate.
ot = output gate: allows the passed-in information to impact the output in the current time step.
Let's consider an example of predicting the next word in the sentence:
"John played tremendously well against the opponent and won for his team. For his contributions, brave ____ was awarded player of the match."
• There could be a lot of choices for the empty space.
• The current input "brave" is an adjective, and adjectives describe a noun.
• So "John" could be the best output after "brave".
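In the standard formulation, the output gate and the new hidden state are:

ot = σ(Wo · [ht-1, xt] + bo)
ht = ot * tanh(Ct)

so the hidden state ht is a filtered version of the cell state, squashed by tanh and gated by ot.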
Use case implementation of LSTM
Let's predict the prices of stocks using an LSTM network: based on stock price data from 2012 to 2016, we predict the stock prices of 2017.
1. Import the Libraries
2. Import the training dataset
3. Feature Scaling
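A minimal sketch of steps 1 to 3, assuming a training CSV named Google_Stock_Price_Train.csv whose second column holds the opening price (both hypothetical names):

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# 2. Import the training dataset (filename and column position are assumptions)
dataset_train = pd.read_csv('Google_Stock_Price_Train.csv')
training_set = dataset_train.iloc[:, 1:2].values  # the "Open" price column

# 3. Feature scaling: normalize the prices to the [0, 1] range
sc = MinMaxScaler(feature_range=(0, 1))
training_set_scaled = sc.fit_transform(training_set)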
4. Create a data structure with 60 timesteps and 1 output
5. Import keras libraries and packages
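A sketch of steps 4 and 5 under the same assumptions; each training sample holds the 60 previous scaled prices, and the label is the next price:

# 4. Create a data structure with 60 timesteps and 1 output
X_train, y_train = [], []
for i in range(60, len(training_set_scaled)):
    X_train.append(training_set_scaled[i-60:i, 0])
    y_train.append(training_set_scaled[i, 0])
X_train, y_train = np.array(X_train), np.array(y_train)
# Reshape to (samples, timesteps, features), the input shape Keras LSTM layers expect
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))

# 5. Import the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout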
6. Initialize the RNN
7. Adding the LSTM layers and some Dropout regularization
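A sketch of steps 6 and 7; four stacked LSTM layers of 50 units each with 20% dropout are a common choice for this exercise, not a requirement:

# 6. Initialize the RNN as a sequential stack of layers
regressor = Sequential()

# 7. Add the LSTM layers with Dropout regularization;
# return_sequences=True passes the full sequence on to the next LSTM layer
regressor.add(LSTM(units=50, return_sequences=True, input_shape=(X_train.shape[1], 1)))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=50))  # the last LSTM layer returns only its final state
regressor.add(Dropout(0.2))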
8. Adding the output layer
9. Compile the RNN
10. Fit the RNN to the training set
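A sketch of steps 8 to 10; the Adam optimizer and mean squared error are typical choices for a regression target, and the epoch and batch-size values are assumptions:

# 8. Add the output layer: one unit predicting the next price
regressor.add(Dense(units=1))

# 9. Compile the RNN
regressor.compile(optimizer='adam', loss='mean_squared_error')

# 10. Fit the RNN to the training set
regressor.fit(X_train, y_train, epochs=100, batch_size=32)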
11. Load the stock price test data for 2017
12. Get the predicted stock price of 2017
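A sketch of steps 11 and 12, assuming a hypothetical test CSV named Google_Stock_Price_Test.csv with the same layout and a price column named 'Open'; each test-day prediction needs the 60 prices before it, so the train and test series are concatenated first:

# 11. Load the stock price test data for 2017
dataset_test = pd.read_csv('Google_Stock_Price_Test.csv')
real_stock_price = dataset_test.iloc[:, 1:2].values

# 12. Build the test windows, predict, then undo the scaling
dataset_total = pd.concat((dataset_train['Open'], dataset_test['Open']), axis=0)
inputs = dataset_total[len(dataset_total) - len(dataset_test) - 60:].values
inputs = sc.transform(inputs.reshape(-1, 1))
X_test = []
for i in range(60, 60 + len(dataset_test)):
    X_test.append(inputs[i-60:i, 0])
X_test = np.reshape(np.array(X_test), (len(X_test), 60, 1))
predicted_stock_price = sc.inverse_transform(regressor.predict(X_test))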
13. Visualize the results of predicted and real stock price
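A sketch of step 13 using matplotlib:

# 13. Visualize the real vs. predicted 2017 prices
plt.plot(real_stock_price, color='red', label='Real Stock Price')
plt.plot(predicted_stock_price, color='blue', label='Predicted Stock Price')
plt.title('Stock Price Prediction')
plt.xlabel('Time')
plt.ylabel('Stock Price')
plt.legend()
plt.show()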
Key Takeaways