Introduction to Deep Learning
Pabitra Mitra
Indian Institute of Technology Kharagpur
pabitra@cse.iitkgp.ac.in
NSM Workshop on Accelerated Data Science
Deep Learning
• Based on neural networks
• Uses deep architectures
• Very successful in many applications
Perceptron
[Figure: a perceptron — input values x1, x2, …, xm with weights w1, w2, …, wm and a bias b feed a summing function producing the induced field v; an activation function φ(·) maps v to the output y]
v = w1 x1 + w2 x2 + … + wm xm + b
y = φ(v)
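As a minimal sketch (not from the slides), the forward pass of this perceptron can be written in a few lines of NumPy; the step activation and the example weights below are illustrative assumptions.

```python
import numpy as np

def step(v):
    """Step activation: 1 if the induced field is non-negative, else 0."""
    return np.where(v >= 0, 1, 0)

def perceptron_forward(x, w, b):
    """Compute the induced field v = w . x + b and the output y = step(v)."""
    v = np.dot(w, x) + b
    return step(v)

# Example: a 2-input perceptron with hand-picked weights (assumed values)
w = np.array([0.5, -0.4])
b = 0.1
print(perceptron_forward(np.array([1.0, 1.0]), w, b))
```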
Neuron Models
●The choice of activation function determines the
neuron model.
Examples:
●step function:
  φ(v) = a   if v < c
  φ(v) = b   if v ≥ c
●ramp function:
  φ(v) = a   if v < c
  φ(v) = b   if v > d
  φ(v) = a + ((v − c)(b − a)) / (d − c)   otherwise
●sigmoid function with z, x, y parameters:
  φ(v) = z + 1 / (1 + exp(−xv + y))
●Gaussian function:
  φ(v) = (1 / (√(2π) σ)) exp(−½ ((v − μ) / σ)²)
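The four activation functions above can be sketched directly in NumPy; the parameter names (a, b, c, d, z, x, y, mu, sigma) follow the formulas on this slide, and the default values below are illustrative assumptions only.

```python
import numpy as np

def step(v, a=0.0, b=1.0, c=0.0):
    """Step: a below the threshold c, b at or above it."""
    return np.where(v < c, a, b)

def ramp(v, a=0.0, b=1.0, c=-1.0, d=1.0):
    """Ramp: a below c, b above d, linear interpolation in between."""
    linear = a + (v - c) * (b - a) / (d - c)
    return np.where(v < c, a, np.where(v > d, b, linear))

def sigmoid(v, z=0.0, x=1.0, y=0.0):
    """Sigmoid with offset z, slope x and shift y."""
    return z + 1.0 / (1.0 + np.exp(-x * v + y))

def gaussian(v, mu=0.0, sigma=1.0):
    """Gaussian activation centred at mu with width sigma."""
    return np.exp(-0.5 * ((v - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
```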
Sigmoid unit
• f is the sigmoid function
• Derivative can be easily computed:
• Logistic equation
• used in many applications
• other functions possible (tanh)
• Single unit:
• apply gradient descent rule
• Multilayer networks: backpropagation
[Figure: a sigmoid unit with inputs x1, x2, …, xn weighted by w1, w2, …, wn, plus a bias input x0 = 1 with weight w0]
net = Σi=0..n wi xi
o = f(net)
f(x) = 1 / (1 + e^(−x))
df(x)/dx = f(x) (1 − f(x))
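To make the gradient-descent rule for a single unit concrete, here is a minimal NumPy sketch (an illustration, not code from the slides); it uses the derivative f(x)(1 − f(x)) above, and the toy AND dataset and learning rate are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy data: inputs with a leading 1 for the bias weight w0, binary targets (AND function)
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
t = np.array([0, 0, 0, 1], dtype=float)

w = np.zeros(3)     # weights w0, w1, w2
eta = 0.5           # learning rate (assumed)

for epoch in range(1000):
    o = sigmoid(X @ w)                 # o = f(net), net = sum_i w_i x_i
    grad = (t - o) * o * (1 - o)       # error times sigmoid derivative
    w += eta * X.T @ grad              # gradient-descent weight update

print(np.round(sigmoid(X @ w), 2))     # approaches [0, 0, 0, 1]
```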
Multi-layer feed-forward NN (FFNN)
● FFNN is a more general network architecture, where there are
hidden layers between input and output layers.
● Hidden nodes neither receive inputs from nor send outputs to the external environment.
● FFNNs overcome the limitation of single-layer NNs: they can handle non-linearly separable learning tasks.
[Figure: a 3-4-2 network — an input layer of 3 nodes, one hidden layer of 4 nodes, and an output layer of 2 nodes]
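As an illustration (assuming PyTorch, which the slides mention later), the 3-4-2 network in the figure can be defined in a few lines; using sigmoid hidden units mirrors the earlier slides and is otherwise an assumption.

```python
import torch
import torch.nn as nn

# A 3-4-2 feed-forward network: 3 inputs, one hidden layer of 4 sigmoid units, 2 outputs
model = nn.Sequential(
    nn.Linear(3, 4),   # input layer -> hidden layer
    nn.Sigmoid(),
    nn.Linear(4, 2),   # hidden layer -> output layer
)

x = torch.randn(5, 3)      # a batch of 5 input vectors
print(model(x).shape)      # torch.Size([5, 2])
```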
Backpropagation
• Initialize all weights to small random numbers
• Repeat
For each training example
1. Input the training example to the network and compute the network outputs
2. For each output unit k
   δk ← ok (1 − ok) (tk − ok)
3. For each hidden unit h
   δh ← oh (1 − oh) Σk∈outputs wk,h δk
4. Update each network weight wj,i
   wj,i ← wj,i + Δwj,i,  where Δwj,i = η δj xj,i
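The update rules above translate into a compact NumPy sketch for one hidden layer (an illustrative implementation, not taken from the slides; the layer sizes and learning rate η are assumptions).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
W1 = rng.normal(0, 0.1, (4, 3))   # input -> hidden weights (small random numbers)
W2 = rng.normal(0, 0.1, (2, 4))   # hidden -> output weights
eta = 0.5

def backprop_step(x, t):
    """One stochastic-gradient step following the backpropagation rules on the slide."""
    global W1, W2
    # 1. Forward pass
    h = sigmoid(W1 @ x)               # hidden unit outputs o_h
    o = sigmoid(W2 @ h)               # output unit outputs o_k
    # 2. Output-unit errors: delta_k = o_k (1 - o_k)(t_k - o_k)
    delta_o = o * (1 - o) * (t - o)
    # 3. Hidden-unit errors: delta_h = o_h (1 - o_h) sum_k w_kh delta_k
    delta_h = h * (1 - h) * (W2.T @ delta_o)
    # 4. Weight updates: w_ji <- w_ji + eta * delta_j * x_ji
    W2 += eta * np.outer(delta_o, h)
    W1 += eta * np.outer(delta_h, x)

backprop_step(np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0]))
```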
NN Design Issues
●Data representation
●Network Topology
●Network Parameters
●Training
●Validation
Expressiveness
• Every bounded continuous function can be approximated with arbitrarily small error by a network with one hidden layer (Cybenko ’89)
• Hidden layer of sigmoid functions
• Output layer of linear functions
• Any function can be approximated to arbitrary
accuracy by a network with two hidden layers
(Cybenko ‘88)
• Sigmoid units in both hidden layers
• Output layer of linear functions
Choice of Neural Network Architecture
• Training set error vs. generalization error
Motivation for Depth
Motivation: Mimic the Brain Structure
[Figure: brain analogy — the input signal passes through sensory neurons arranged in coupled layers, then mid/low-level neurons, up to a higher-brain decision; correspondingly, an end-to-end neural architecture performs feature extraction, learning, and decision]
Motivation
• Practical success in computer vision, signal processing, text mining
• Increase in volume and complexity of data
• Availability of GPUs
Convolutional Neural Network: Motivation
CNN
ResNet
CNN + Skip Connections
Pyramidal cells in cortex
Full ResNet architecture:
• Stack residual blocks
• Every residual block has two 3x3 conv layers
• Periodically, double # of filters and
downsample spatially using stride 2 (in each
dimension)
• Additional conv layer at the beginning
• No FC layers at the end (only FC 1000 to
output classes)
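A residual block matching this description (two 3x3 conv layers plus a skip connection) can be sketched in PyTorch as below; the batch-norm placement and the 1x1 projection used when downsampling follow common ResNet practice and are assumptions rather than details from the slide.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Two 3x3 conv layers with a skip connection; stride 2 halves the spatial size."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        # Project the input when the shape changes, so the addition is well-defined
        self.shortcut = nn.Identity()
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + self.shortcut(x))  # skip connection

# Example: double the number of filters and downsample spatially with stride 2
block = ResidualBlock(64, 128, stride=2)
print(block(torch.randn(1, 64, 32, 32)).shape)   # torch.Size([1, 128, 16, 16])
```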
DenseNet
Challenges of Depth
• Overfitting – dropout
• Vanishing gradient – ReLU activation
• Accelerating training – batch normalization
• Hyperparameter tuning
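For illustration, the first three remedies listed above appear as standard layers in frameworks such as PyTorch; the small block below is a sketch of how they are commonly combined (layer sizes and dropout rate are assumptions, not an architecture from the slides).

```python
import torch.nn as nn

# A typical fully-connected block combining the remedies above:
# ReLU against vanishing gradients, batch norm to accelerate training,
# dropout to reduce overfitting.
block = nn.Sequential(
    nn.Linear(256, 128),
    nn.BatchNorm1d(128),
    nn.ReLU(),
    nn.Dropout(p=0.5),
)
```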
Computational Complexity
Types of Deep Architectures
• RNN, LSTM (sequence learning)
• Stacked Autoencoders (representation learning)
• GAN (classification, distribution learning)
• Combining architectures – unified backprop if all layers differentiable
• TensorFlow, PyTorch
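To make "unified backprop if all layers are differentiable" concrete, here is an illustrative PyTorch sketch (not from the slides; the layer sizes and task are assumptions) in which a convolutional front end feeds an LSTM and a single backward pass trains both.

```python
import torch
import torch.nn as nn

class ConvLSTMClassifier(nn.Module):
    """CNN front end + LSTM back end; one loss, one backward pass through both."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv1d(1, 16, 5, padding=2), nn.ReLU())
        self.lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):                           # x: (batch, 1, time)
        f = self.conv(x)                            # (batch, 16, time)
        out, _ = self.lstm(f.transpose(1, 2))       # (batch, time, 32)
        return self.head(out[:, -1])                # classify from the last time step

model = ConvLSTMClassifier()
x, y = torch.randn(8, 1, 50), torch.randint(0, 10, (8,))
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()        # gradients flow through the LSTM and the CNN alike
```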
References
• Deep Learning – Ian Goodfellow, Yoshua Bengio, and Aaron Courville (MIT Press)
• Stanford Deep Learning course