Visualizing and Understanding
Convolutional Networks
신우철
Introduction
• Renewed interest in CNNs is driven by: 1) the availability of much larger
training sets, 2) powerful GPU implementations, 3) better model regularization
strategies.
• A visualization technique reveals the input stimuli that excite individual
feature maps, and the evolution of features during training
1) Multi-layered Deconvnet: project feature activations back to the input
pixel space.
2) Sensitivity analysis of the classifier output: occluding portions of the input
image, revealing which parts of the scene are important.
Approach
1) Visualization with Deconvnet
• A deconvnet uses the same components as a convnet model, such as filtering
and pooling, but in reverse.
• This way the feature activities in intermediate layers can be mapped back to
the input pixel space, showing what input pattern caused a given activation
in the feature maps.
Approach
(1) Present an input image to the convnet and compute features throughout the
layers.
(2) To examine a given activation, set all other activations in the layer to
zero and pass the feature maps as input to the attached deconvnet layer.
(3) (i) Unpool, (ii) rectify, and (iii) filter to reconstruct the activity in
the layer beneath.
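Step (2) above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `isolate_strongest_activation`, the `(channels, height, width)` layout, and the choice to keep only the single strongest activation in the chosen map are assumptions made for the example.

```python
import numpy as np

def isolate_strongest_activation(feature_maps, channel):
    """Zero every activation in a (C, H, W) array except the strongest one in `channel`.

    Hypothetical helper: the result is what would be handed to the deconvnet.
    """
    isolated = np.zeros_like(feature_maps)
    selected = feature_maps[channel]
    # Locate the maximum response inside the chosen feature map.
    row, col = np.unravel_index(np.argmax(selected), selected.shape)
    isolated[channel, row, col] = selected[row, col]
    return isolated

# Toy layer with two 2x2 feature maps.
feature_maps = np.array([[[0.1, 0.5], [0.2, 0.0]],
                         [[0.9, 0.3], [0.4, 2.0]]])
isolated = isolate_strongest_activation(feature_maps, channel=1)
```

Only one entry of `isolated` is non-zero, so anything reconstructed in pixel space can be attributed to that single activation.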
Approach
(i) Unpooling
• Max pooling is not invertible, so an approximate inverse is obtained by
recording the locations of the maxima within each pooling region in a set of
switch variables, preserving the structure of the stimulus.
(ii) Rectification
• Pass the reconstructed signal through a ReLU non-linearity
(iii) Filtering
• Use transposed versions of the same filters
Approach
(i) Unpooling
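The switch mechanism can be sketched as follows. This is a minimal NumPy illustration for a single 2-D map with non-overlapping pooling regions; the function names and the region size of 2 are assumptions for the example, and the pooling regions in the actual model may be configured differently.

```python
import numpy as np

def max_pool_with_switches(x, size=2):
    """Non-overlapping max pooling; also returns the argmax (switch) of each region."""
    ph, pw = x.shape[0] // size, x.shape[1] // size
    # Rearrange so that each pooling region becomes one flat vector of length size*size.
    regions = (x[:ph * size, :pw * size]
               .reshape(ph, size, pw, size)
               .transpose(0, 2, 1, 3)
               .reshape(ph, pw, size * size))
    return regions.max(axis=2), regions.argmax(axis=2)

def unpool(pooled, switches, size=2):
    """Place each pooled value back at the location recorded by its switch."""
    ph, pw = pooled.shape
    out = np.zeros((ph * size, pw * size), dtype=pooled.dtype)
    rows = np.arange(ph)[:, None] * size + switches // size
    cols = np.arange(pw)[None, :] * size + switches % size
    out[rows, cols] = pooled
    return out

x = np.array([[1., 2., 0., 1.],
              [3., 4., 1., 0.],
              [0., 0., 5., 6.],
              [1., 0., 7., 8.]])
pooled, switches = max_pool_with_switches(x)
restored = unpool(pooled, switches)
# Rectification step: keep only the positive part of the reconstruction.
rectified = np.maximum(restored, 0.0)
```

Every position that did not hold a regional maximum stays at zero, which is why the reconstruction is only an approximate inverse of pooling.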
Approach
(iii) Filtering
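What "transposed" means here can be checked numerically. The sketch below assumes a single-channel map, stride 1, and no padding, none of which is stated in the slides. It implements a forward filter, its transpose, and then verifies two properties: the transpose is the adjoint of the forward filter, and it is equivalent to filtering a zero-padded map with the kernel flipped vertically and horizontally.

```python
import numpy as np

def correlate_valid(x, kernel):
    """Forward filtering as used in a convnet layer (stride 1, no padding)."""
    kh, kw = kernel.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def correlate_transposed(y, kernel):
    """Transpose of correlate_valid: scatter each value of y back through the kernel."""
    kh, kw = kernel.shape
    out = np.zeros((y.shape[0] + kh - 1, y.shape[1] + kw - 1))
    for i in range(y.shape[0]):
        for j in range(y.shape[1]):
            out[i:i + kh, j:j + kw] += y[i, j] * kernel
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 5))
kernel = rng.standard_normal((3, 3))
y = rng.standard_normal((3, 3))

# Adjoint identity: <A x, y> equals <x, A^T y>.
lhs = np.sum(correlate_valid(x, kernel) * y)
rhs = np.sum(x * correlate_transposed(y, kernel))

# Same result from the flipped kernel applied to a zero-padded map.
flipped = correlate_valid(np.pad(y, 2), kernel[::-1, ::-1])
```

Both checks hold for any kernel, so the deconvnet needs no new parameters: it reuses the learned filters of the convnet.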
Training Details
• Used an AlexNet-style architecture, but replaced the sparse connections
(a result of splitting the model across 2 GPUs) with dense connections.
• A few details were modified based on the visualization results.
Convnet Visualization
(1) Feature Visualization
• Projecting each feature map down to pixel space reveals the different
structures that excite a given feature map. The projections vary less than the
corresponding image patches, because the visualization focuses solely on the
discriminant structure within each patch.
(i) The strong grouping within each feature map
(ii) Greater invariance at higher layers
(iii) Exaggeration of discriminative parts of the image
Convnet Visualization
(2) Feature Evolution during Training
• Lower layers of the model can be seen to converge within a few epochs,
while upper layers only develop after a considerable number of epochs.
• Sudden jumps in appearance result from a change in the image from which the
strongest activation originates.
* Visualization confirms the intuition that higher layers need more time to
converge than lower layers. It can also be used for tuning hyperparameters.
Convnet Visualization
(3) Feature Invariance
• Small transformations have a dramatic effect on the first layer of the model,
but a lesser impact on the top feature layer, where the effect is quasi-linear
for translation and scaling.
• The output is not invariant to rotation, except for objects with rotational
symmetry.
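One way to probe this kind of invariance is to compare the feature vector of an image with that of a transformed copy, using Euclidean distance. The sketch below is only an illustration: `pixel_features` and `pooled_features` are toy stand-ins for a low layer and a more pooled, higher layer, not the network described in the slides.

```python
import numpy as np

def feature_distance(extract, image, transform):
    """Euclidean distance between features of an image and its transformed copy."""
    return float(np.linalg.norm(extract(image) - extract(transform(image))))

def pixel_features(image):
    # Toy stand-in for a low layer: raw pixels, sensitive to any shift.
    return image.ravel()

def pooled_features(image, size=4):
    # Toy stand-in for a higher layer: coarse max pooling over size x size blocks.
    h, w = image.shape
    return image.reshape(h // size, size, w // size, size).max(axis=(1, 3)).ravel()

def shift_right(image):
    # Translate the image by one pixel along the horizontal axis.
    return np.roll(image, shift=1, axis=1)

image = np.zeros((8, 8))
image[2, 2] = 1.0  # a single bright "object" pixel

low_layer_change = feature_distance(pixel_features, image, shift_right)
high_layer_change = feature_distance(pooled_features, image, shift_right)
```

In this toy case the one-pixel shift changes the raw-pixel features but leaves the pooled features untouched, mirroring the qualitative trend in the bullets above.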
Convnet Visualization
1. Architecture Selection
• Visualizing the trained model helped select better architectures:
(i) Reduced the 1st-layer filter size from 11 x 11 to 7 x 7
(ii) Reduced the stride of the 1st-layer convolution from 4 to 2
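The effect of these two changes on the spatial size of the first-layer output can be worked out with the standard convolution size formula. In the sketch below, the 224-pixel input and the zero padding are assumptions for illustration, since the slides do not state them.

```python
def conv_output_size(input_size, kernel_size, stride, padding=0):
    """Number of output positions along one spatial dimension of a convolution."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# Assumed 224-pixel input, no padding (neither value is given in the slides).
original = conv_output_size(224, kernel_size=11, stride=4)
modified = conv_output_size(224, kernel_size=7, stride=2)
```

Under these assumptions the modified layer evaluates roughly twice as many positions per dimension (109 versus 54), so the input is sampled more densely by smaller filters.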
Convnet Visualization
2. Occlusion Sensitivity
• The probability of the correct class drops significantly when the object is
occluded.
• The model is therefore localizing the object in the image rather than
relying on the surrounding context.
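The occlusion experiment can be sketched as a sliding-window loop. The code below is a minimal illustration under stated assumptions: `predict_prob` is a hypothetical callable returning the probability of the correct class, `toy_predict_prob` is a made-up stand-in for a trained classifier, and the patch size, stride, and fill value are arbitrary.

```python
import numpy as np

def occlusion_map(image, predict_prob, patch=2, stride=2, fill=0.5):
    """Slide an occluding square over the image and record the correct-class probability."""
    h, w = image.shape[:2]
    rows = range(0, h - patch + 1, stride)
    cols = range(0, w - patch + 1, stride)
    heat = np.zeros((len(rows), len(cols)))
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            occluded = image.copy()
            occluded[r:r + patch, c:c + patch] = fill  # cover one region
            heat[i, j] = predict_prob(occluded)
    return heat

def toy_predict_prob(image):
    # Hypothetical classifier: confidence is the mean brightness of the object region.
    return float(image[2:4, 2:4].mean())

image = np.zeros((6, 6))
image[2:4, 2:4] = 1.0  # the "object"
heat = occlusion_map(image, toy_predict_prob, fill=0.0)
lowest = np.unravel_index(heat.argmin(), heat.shape)
```

The lowest value in `heat` marks the occluder position that hurts the prediction most; for a classifier that relies on the object rather than its context, that position coincides with the object.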
Convnet Visualization
3. Correspondence Analysis
• Deep models implicitly compute correspondence between specific object
parts in images.
Experiments
• The overall depth of the model (number of convolutional layers) is important
for performance, while increasing the size of both the convolutional layers and
the fully-connected layers might result in overfitting.

Visualizing and understanding convolutional networks(2014)
