Day 1 Lecture 4
Backward Propagation
Elisa Sayrol
[course site]
Learning
Purely Supervised
Typically Backpropagation + Stochastic Gradient Descent (SGD)
Good when there are lots of labeled data
Layer-wise Unsupervised + Supervised classifier
Train each layer in sequence, using regularized auto-encoders or Restricted Boltzmann Machines (RBM)
Freeze the feature extractor and train a linear classifier on top of its features
Good when labeled data is scarce but there are lots of unlabeled data
Layer-wise Unsupervised + Supervised Backprop
Train each layer in sequence
Backprop through the whole system
Good when learning problem is very difficult
Slide Credit: LeCun
From Lecture 3
L Hidden Layers
Hidden pre-activation (k>0)
Hidden activation (k=1,…L)
Output activation (k=L+1)
Figure Credit: Hugo Larochelle, NN course
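As a sketch of the three quantities named on this slide, assuming the usual notation for a feed-forward network with $L$ hidden layers (the symbols $g$ for the hidden activation function, $o$ for the output activation function, and $f$ for the network output are assumptions here), they can be written as:

```latex
\begin{align}
a^{(k)}(x) &= b^{(k)} + W^{(k)} h^{(k-1)}(x), \qquad h^{(0)}(x) = x
  && \text{(pre-activation, } k > 0\text{)} \\
h^{(k)}(x) &= g\big(a^{(k)}(x)\big)
  && \text{(hidden activation, } k = 1, \dots, L\text{)} \\
h^{(L+1)}(x) &= o\big(a^{(L+1)}(x)\big) = f(x)
  && \text{(output activation, } k = L+1\text{)}
\end{align}
```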
Backpropagation algorithm
The output of the network gives class scores that depend on the input and the parameters
• Define a loss function that quantifies our unhappiness with the
scores across the training data.
• Come up with a way of efficiently finding the parameters that
minimize the loss function (optimization)
Forward Pass
Probability of a class given an input (softmax)
Loss function, e.g., negative log-likelihood (good for classification)
Regularization term (L2 norm), also known as weight decay
Minimize the loss (plus some regularization term) w.r.t. the parameters over the whole training set.
[Figure: Input x -> W1 -> a2, h2 (Hidden) -> W2 -> a3, h3 (Hidden) -> W3 -> a4, h4 (Output) -> Loss]
Figure Credit: Kevin McGuinness
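A minimal NumPy sketch of the forward pass may help make the softmax, negative log-likelihood, and L2 terms concrete. The variable names follow the figure (W1, a2, h2, and so on); the tanh hidden activation, the 0.5 factor on the L2 term, the helper names `init_params`, `forward`, and `loss`, and the layer sizes are assumptions chosen for illustration, not taken from the slides.

```python
import numpy as np

def softmax(a):
    # Subtract the row-wise max for numerical stability; each row sums to 1.
    shifted = a - a.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def init_params(sizes, seed=0):
    # Hypothetical helper: small random weights and zero biases,
    # returned as [W1, b1, W2, b2, W3, b3].
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        params += [rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)]
    return params

def forward(x, params):
    # x has shape (batch, features); tanh hidden units are an assumption.
    W1, b1, W2, b2, W3, b3 = params
    a2 = x @ W1 + b1
    h2 = np.tanh(a2)
    a3 = h2 @ W2 + b2
    h3 = np.tanh(a3)
    a4 = h3 @ W3 + b3
    h4 = softmax(a4)  # class probabilities given the input
    return h4

def loss(x, y, params, weight_decay=1e-3):
    # Mean negative log-likelihood of the correct class plus an L2 penalty
    # on the weight matrices (biases are left unregularized here).
    h4 = forward(x, params)
    n = x.shape[0]
    nll = -np.log(h4[np.arange(n), y]).mean()
    l2 = sum((W ** 2).sum() for W in params[0::2])
    return nll + 0.5 * weight_decay * l2
```

For example, `loss(x, y, init_params([4, 5, 5, 3]))` evaluates the regularized loss for a batch `x` of shape `(n, 4)` with integer labels `y` in `{0, 1, 2}`.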
Backpropagation algorithm
• We need a way to fit the model to data: find parameters ($W^{(k)}$, $b^{(k)}$) of the network that (locally) minimize the loss function.
• We can use stochastic gradient descent. Or better yet, mini-batch
stochastic gradient descent.
• To do this, we need to find the gradient of the loss function with respect to all the parameters of the model ($W^{(k)}$, $b^{(k)}$).
• These can be found using the chain rule of differentiation.
• The calculations reveal that the gradient w.r.t. the parameters in layer k depends only on the error from the layer above and the output from the layer below.
• This means that the gradients for each layer can be computed iteratively,
starting at the last layer and propagating the error back through the network.
This is known as the backpropagation algorithm.
Slide Credit: Kevin McGuinness
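The chain-rule result described in the bullets can be sketched as follows. This assumes column-vector notation with $a^{(k)} = W^{(k)} h^{(k-1)} + b^{(k)}$ and $h^{(k)} = g(a^{(k)})$, and introduces the symbol $\delta^{(k)} = \partial \mathcal{L} / \partial a^{(k)}$ for the error at layer $k$; these symbols are assumptions for illustration. The top-layer expression holds for a softmax output with negative log-likelihood loss, where $e(y)$ denotes the one-hot vector of the true class.

```latex
\begin{align}
\delta^{(L+1)} &= f(x) - e(y)
  && \text{(error in the top layer, softmax + NLL)} \\
\frac{\partial \mathcal{L}}{\partial W^{(k)}} &= \delta^{(k)} \big(h^{(k-1)}\big)^{\top},
\qquad
\frac{\partial \mathcal{L}}{\partial b^{(k)}} = \delta^{(k)}
  && \text{(parameter gradients at layer } k\text{)} \\
\delta^{(k-1)} &= \Big(\big(W^{(k)}\big)^{\top} \delta^{(k)}\Big) \odot g'\big(a^{(k-1)}\big)
  && \text{(error passed to the layer below)}
\end{align}
```

The gradient for $W^{(k)}$ uses only $\delta^{(k)}$, which is obtained from the layer above, and $h^{(k-1)}$, the output of the layer below, which is why a single backward sweep suffices.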
Backward Pass
1. Find the error in the top layer
2. Compute weight updates
3. Backpropagate error to layer below
[Figure: same network as in the forward pass (Input x, W1, a2, h2, W2, a3, h3, W3, a4, h4, Loss L), traversed from the loss back toward the input]
Figure Credit: Kevin McGuinness
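The three steps can be sketched in NumPy as below. This is an illustrative implementation under stated assumptions, not the slides' own code: it assumes tanh hidden units, a softmax output with mean negative log-likelihood loss, and a row-per-example batch layout; the function names `forward`, `nll`, `backward`, and `numerical_grad` are hypothetical. The finite-difference helper is included only as a way to sanity-check the analytic gradients.

```python
import numpy as np

def forward(x, Ws, bs):
    # Returns the layer outputs [x, h2, h3, ...] and the softmax probabilities.
    hs = [x]
    for W, b in zip(Ws[:-1], bs[:-1]):
        hs.append(np.tanh(hs[-1] @ W + b))
    a_out = hs[-1] @ Ws[-1] + bs[-1]
    a_out = a_out - a_out.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(a_out)
    p /= p.sum(axis=1, keepdims=True)
    return hs, p

def nll(p, y):
    # Mean negative log-likelihood of the correct classes.
    return -np.log(p[np.arange(len(y)), y]).mean()

def backward(hs, p, y, Ws):
    n = len(y)
    # 1. Error in the top layer: for softmax + NLL this is (p - onehot(y)) / n.
    delta = p.copy()
    delta[np.arange(n), y] -= 1.0
    delta /= n
    dWs, dbs = [], []
    for k in reversed(range(len(Ws))):
        # 2. Gradients for this layer: error here times output of the layer below.
        dWs.append(hs[k].T @ delta)
        dbs.append(delta.sum(axis=0))
        if k > 0:
            # 3. Backpropagate the error through W, then through tanh
            #    (whose derivative is 1 - h^2).
            delta = (delta @ Ws[k].T) * (1.0 - hs[k] ** 2)
    return dWs[::-1], dbs[::-1]

def numerical_grad(x, y, Ws, bs, k, i, j, eps=1e-6):
    # Central finite difference for a single weight, used as a gradient check.
    Ws[k][i, j] += eps
    up = nll(forward(x, Ws, bs)[1], y)
    Ws[k][i, j] -= 2 * eps
    down = nll(forward(x, Ws, bs)[1], y)
    Ws[k][i, j] += eps  # restore the original value
    return (up - down) / (2 * eps)
```

Comparing an entry of `backward(...)` with `numerical_grad(...)` for the same weight is a common way to catch indexing or sign mistakes before training.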
Optimization
Stochastic Gradient Descent
Stochastic Gradient Descent with momentum
Stochastic Gradient Descent with L2 regularization
Hyperparameters: learning rate, weight decay
Recommended lectures:
http://cs231n.github.io/optimization-1/
http://cs231n.github.io/optimization-2/
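The three update rules named on this slide can be sketched as follows. The formulas are one common formulation and are assumptions here, since the slide's own equations are not available: the momentum form keeps a velocity `v` that accumulates past gradients, and the L2 form adds `weight_decay * w` to the gradient. The function and parameter names (`lr`, `momentum`, `weight_decay`) are hypothetical.

```python
import numpy as np

def sgd_step(w, grad, lr=0.1):
    # Plain SGD: move against the gradient, scaled by the learning rate.
    return w - lr * grad

def sgd_momentum_step(w, v, grad, lr=0.1, momentum=0.9):
    # Momentum: the velocity is a decaying sum of past gradient steps.
    v = momentum * v - lr * grad
    return w + v, v

def sgd_l2_step(w, grad, lr=0.1, weight_decay=1e-2):
    # L2 regularization: the penalty 0.5 * weight_decay * ||w||^2 contributes
    # weight_decay * w to the gradient, shrinking the weights at every step.
    return w - lr * (grad + weight_decay * w)
```

In mini-batch training, `grad` would be the gradient of the loss averaged over the current mini-batch, as produced by backpropagation.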


Deep Learning for Computer Vision: Backward Propagation (UPC 2016)
