Application of deep learning to
computer vision
Presented by: Djamal Abide
Plan
1. Data science
2. Artificial intelligence
3. Computer vision
4. Deep Learning
5. Demo
March 23, 2017 Djamal Abide 2
Data Science Definition
It is an interdisciplinary field about
processes and systems to extract
knowledge or insights from data in
various forms, either structured or
unstructured.
Examples of AI Applications
Type | Examples
Monitoring | 1. Detecting credit-card fraud 2. Cybersecurity intrusions
Discovering | 1. Genetics 2. Causal models for air transport safety
Predicting | 1. Netflix movie recommendations 2. Weather forecasting
Interpreting | 1. Face detection (images) 2. Pedestrian detection (videos) 3. Speech recognition (audio)
Data Science Team Skill Set
Data Science draws on:
• Data Engineering
• Scientific Method
• Math
• Statistics
• Advanced Computing
• Visualization
• Hacker Mindset
The Scientific Method
1. Ask Questions
2. Research & Gather Data
3. Formulate Hypothesis
4. Test Hypothesis (Experiments)
5. Analyze Results (Draw Conclusion)
6. Report Results
Artificial Intelligence Research Fields
• Natural Language Processing (NLP)
• Computer Vision
• Robotics
• Problem-solving and planning
• Machine Learning
• Knowledge Representation
What is Computer Vision?
It’s a field that includes methods for
acquiring, processing, analyzing and
understanding images from the real world
in order to produce information in the form
of decisions.
Applications
• Recognize objects
• Locate objects in space
• Track objects
• Recognize actions
Computer Vision Components
• Optics
• Machine Learning
• Digital Image Processing
Radiation wavelengths
Source: https://www.pro-therm.com/images/infrared_basics_figure2_large.gif
Colored Image Data Structure
Red, Green and Blue values are between 0 and 255
Gray Scale Image Data Structure
Intensity values are between 0 and 255
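The two data structures can be sketched in plain Python. This is only an illustration: the 2 x 2 images and the helper name is_valid_channel are made up for the example, and real projects would normally use an array library instead of nested lists.

```python
# A 2 x 2 color image stored as rows of (red, green, blue) tuples.
color_image = [
    [(255, 0, 0), (0, 255, 0)],      # red pixel, green pixel
    [(0, 0, 255), (255, 255, 255)],  # blue pixel, white pixel
]

# A 2 x 2 gray scale image stored as rows of single intensity values.
gray_image = [
    [0, 64],
    [128, 255],
]


def is_valid_channel(value):
    """Return True when a channel or intensity value lies in 0..255."""
    return 0 <= value <= 255


# Check every channel of every pixel in the color image.
all_color_ok = all(
    is_valid_channel(channel)
    for row in color_image
    for pixel in row
    for channel in pixel
)
# Check every intensity in the gray scale image.
all_gray_ok = all(is_valid_channel(v) for row in gray_image for v in row)

print(all_color_ok, all_gray_ok)
```

A color pixel carries three values and a gray scale pixel carries one, which is why a color image needs three times the storage of a gray scale image of the same size.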
Image Processing Examples
• Resized
• Gray Scale
• Edge Detection
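A minimal sketch of the three operations, written in plain Python so that each step is visible. The function names and the 4 x 4 test image are hypothetical, the gray scale conversion assumes the common 0.299 / 0.587 / 0.114 luma weights, and the edge detector is a deliberately simple horizontal difference, not the method used to produce the slide's pictures.

```python
def to_gray(color_image):
    """Convert (r, g, b) pixels to intensities using common luma weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in color_image
    ]


def resize_half(gray):
    """Shrink an image by keeping every second row and every second column."""
    return [row[::2] for row in gray[::2]]


def horizontal_gradient(gray):
    """Absolute difference between horizontally adjacent pixels.

    Large values mark vertical edges (sharp left-to-right changes).
    """
    return [
        [abs(row[x + 1] - row[x]) for x in range(len(row) - 1)]
        for row in gray
    ]


# Hypothetical 4 x 4 image: left half black, right half white.
black, white = (0, 0, 0), (255, 255, 255)
image = [[black, black, white, white] for _ in range(4)]

gray = to_gray(image)
small = resize_half(gray)
edges = horizontal_gradient(gray)

print(gray[0])   # [0, 0, 255, 255]
print(small)     # [[0, 255], [0, 255]]
print(edges[0])  # [0, 255, 0]
```

The single large value in each row of edges sits exactly where the image switches from black to white.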
Machine Learning vs Classical Program
Classical Program: x -> f(x) -> y
Input Data -> Program Implementing f(x) -> Result
Machine Learning: the function f(x) is learned from the data
Training Examples (x1, y1), (x2, y2), ... -> ML Algorithm (ML Program To Learn f(x)) -> Result: Model f(x)
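The contrast can be sketched as follows. In the classical case the rule is written by hand; in the machine learning case the same rule is recovered from example pairs. The rule y = 2x + 1 and the use of ordinary least squares as the "ML algorithm" are assumptions chosen to keep the example small.

```python
def classical_f(x):
    """Classical program: the rule f(x) is written by hand."""
    return 2 * x + 1


def learn_linear_model(examples):
    """Fit y = a * x + b to (x, y) pairs with ordinary least squares."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    variance = sum((x - mean_x) ** 2 for x, _ in examples)
    a = covariance / variance
    b = mean_y - a * mean_x

    def model(x):
        return a * x + b

    return model


# Training examples (x1, y1), (x2, y2), ... generated from the hidden rule.
training_examples = [(x, classical_f(x)) for x in range(5)]
learned_f = learn_linear_model(training_examples)

print(classical_f(10))  # 21
print(learned_f(10))    # 21.0
```

The learning function never sees the formula 2x + 1, only the pairs, yet the returned model gives the same answer on a new input.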
Prediction With Machine Learning Model
x + Model f(x) -> Prediction Tool -> Predicted y
Prediction Evaluation
Predicted y + Real y -> Comparison Tool -> Accuracy
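A small sketch of the two steps, prediction and comparison. The threshold model, the inputs, and the real labels are all hypothetical; accuracy is taken here as the fraction of predictions that match the real values.

```python
def predict_all(model, inputs):
    """Prediction tool: apply the model to every input x."""
    return [model(x) for x in inputs]


def accuracy(predicted, real):
    """Comparison tool: fraction of predictions equal to the real values."""
    if len(predicted) != len(real):
        raise ValueError("predicted and real must have the same length")
    correct = sum(p == r for p, r in zip(predicted, real))
    return correct / len(real)


def threshold_model(x):
    """Hypothetical model f(x): label 1 when x is at least 5, else 0."""
    return 1 if x >= 5 else 0


inputs = [1, 4, 5, 7, 9]
real_y = [0, 1, 1, 1, 0]

predicted_y = predict_all(threshold_model, inputs)
score = accuracy(predicted_y, real_y)

print(predicted_y)  # [0, 0, 1, 1, 1]
print(score)        # 0.6
```

Three of the five predictions match the real labels, so the accuracy is 0.6.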
Human brain and Artificial Neural Networks
• Human brain doesn’t need features
• Activation function
Source: https://i.ytimg.com/vi/osa3zIEJjgw/maxresdefault.jpg
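To make the role of the activation function concrete, here is a sketch of a single artificial neuron: a weighted sum of the inputs plus a bias, passed through an activation. The slide does not say which activation it shows; the sigmoid below, and the weights and inputs, are assumptions for illustration.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Hypothetical inputs and weights: 0.5 * 1.0 + (-0.25) * 2.0 + 0.0 = 0.0
output = neuron([1.0, 2.0], [0.5, -0.25], 0.0)
print(output)  # 0.5
```

A network is built by feeding the outputs of many such neurons into the next layer of neurons.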
Deep neural networks learn hierarchical feature representations
Source: https://nivdul.files.wordpress.com/2015/11/nivdul_deep_learning.png?w=700&h=367
Deep Learning Flow For Training Models
Input data -> Preprocessing -> Enhanced Clean Data -> Features Extraction (X) -> Features (X) -> Deep Learning -> Model
• Features Extraction and Features are marked X: the network learns its own features
• Without clean data, Deep Learning cannot learn or discover patterns
Traditional Machine Learning Flow For Training Models
Input data -> Preprocessing -> Enhanced Clean Data -> Features Extraction -> Features (help in finding patterns) -> Traditional ML Algorithm -> Model
• Clean data helps in engineering robust features
• Without good features, ML algorithm cannot learn or discover patterns
Why is it hard to recognize objects?
• Segmentation: Picture contains many objects
• Lighting: Intensity of light
• Deformation: Handwriting with many styles
• Affordance: Objects labeled based on what they are used for. Example: chairs
• Viewpoint: Picture could be taken from different angles
Convolutional layer
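The slide shows the convolutional layer as a picture only. As a rough sketch of the core operation, the code below slides a small kernel over an image and sums the element-wise products at each position (no padding, stride 1; strictly speaking this is cross-correlation, which is what most deep learning libraries compute under the name convolution). The image and kernel values are hypothetical, and a real layer would also learn the kernel values, add a bias, and apply an activation.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Sum of element-wise products between kernel and image patch.
            total = 0
            for di in range(k_h):
                for dj in range(k_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4 x 4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 2 x 2 kernel that responds to dark-to-bright transitions.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map is non-zero only in the column where the image changes from dark to bright, which is the pattern this kernel detects.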
Pooling layer
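The slide shows the pooling layer as a picture only. A common variant, assumed here, is max pooling with a 2 x 2 window and stride 2: each non-overlapping window of the feature map is replaced by its largest value, which halves the width and height. The feature map values are hypothetical.

```python
def max_pool2d(feature_map, size=2):
    """Replace each non-overlapping size x size window by its maximum."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4 x 4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

Pooling keeps the strongest response in each region and discards its exact position, which makes the following layers less sensitive to small shifts in the input.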
LeNet: 1st successful CNN
Source: http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf
MNIST Database
• MNIST contains 70,000 pictures of 10 different digits (0 to 9)
• Format of a picture is 28 x 28 pixels
• Scientists use 60,000 pictures to train and 10,000 pictures for testing
Classifier | Preprocessing | Test Error Rate (%) | Reference
Linear Classifiers
linear classifier (1-layer NN) | none | 12.0 | LeCun et al. 1998
linear classifier (1-layer NN) | deskewing | 8.4 | LeCun et al. 1998
pairwise linear classifier | deskewing | 7.6 | LeCun et al. 1998
K-Nearest Neighbors
K-nearest-neighbors, Euclidean (L2) | none | 5.0 | LeCun et al. 1998
…
K-NN, shape context matching | shape context feature extraction | 0.63 | Belongie et al. IEEE PAMI 2002
Boosted Stumps
boosted stumps | none | 7.7 | Kegl et al., ICML 2009
…
product of stumps on Haar features | Haar features | 0.87 | Kegl et al., ICML 2009
Non-Linear Classifiers
40 PCA + quadratic classifier | none | 3.3 | LeCun et al. 1998
1000 RBF + linear classifier | none | 3.6 | LeCun et al. 1998
Classifier | Preprocessing | Test Error Rate (%) | Reference
SVMs
SVM, Gaussian Kernel | none | 1.4 |
… | … | … | …
Virtual SVM, deg-9 poly, 2-pixel jittered | deskewing | 0.56 | DeCoste and Scholkopf, MLJ 2002
Neural Nets
2-layer NN, 300 hidden units, mean square error | none | 4.7 | LeCun et al. 1998
…
6-layer NN 784-2500-2000-1500-1000-500-10 (on GPU) [elastic distortions] | none | 0.35 | Ciresan et al. Neural Computation 10, 2010 and arXiv 1003.0358, 201
Convolutional nets
Convolutional net LeNet-1 | subsampling to 16x16 pixels | 1.7 | LeCun et al. 1998
…
committee of 35 conv. net, 1-20-P-40-P-150-10 [elastic distortions] | width normalization | 0.23 | Ciresan et al. CVPR 2012
Source: http://yann.lecun.com/exdb/mnist/
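The layer sizes 784-2500-2000-1500-1000-500-10 in the table can be read as follows: 784 inputs (one per pixel of a 28 x 28 picture), five hidden layers, and 10 outputs (one per digit). The sketch below counts the trainable parameters of such a network, assuming every layer is fully connected and every non-input unit has one bias; the helper name count_parameters is made up for the example.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected network."""
    # One weight per connection between consecutive layers.
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    # One bias per unit in every layer except the input layer.
    biases = sum(layer_sizes[1:])
    return weights + biases


layer_sizes = [784, 2500, 2000, 1500, 1000, 500, 10]

input_pixels = 28 * 28
total = count_parameters(layer_sizes)

print(input_pixels)  # 784
print(total)         # 11972510
```

Roughly 12 million parameters have to be adjusted during training, which helps explain why the table notes that this network was trained on a GPU.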
Deep Learning: GPU versus CPU
Source: http://www.nvidia.com/object/tesla-m40.html
Large Scale Visual Recognition Challenge 2012 (ILSVRC2012)
• Number of images: ~ 1.2 million training images, drawn from the ~ 14 million images of ImageNet
• Number of categories: 1,000
• Team “SuperVision”, formed by Alex Krizhevsky and Ilya Sutskever, students of Professor Geoffrey Hinton at the University of Toronto, won the ImageNet classification challenge by a large margin
Deep Learning: Pros & Cons
Pros
• Enables learning of features rather than hand tuning
• Impressive performance gains in:
– Computer vision
– Speech recognition
– Some text analysis
• Potential for more impact
Cons
• Requires a lot of data for high accuracy
• Computationally very expensive
• Hard to tune:
– Choice of architecture
– Parameter types
– Hyper-parameters
– Learning algorithm
– …
Advice
• Use segmented images as training set
• Use data augmentation techniques
• Don’t be a ‘hero’ trying to create your own deep neural network (CNN) architecture; use an existing one
• Use transfer learning (pre-trained models)
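As an illustration of data augmentation, the sketch below produces extra training pictures from one original by flipping and shifting it. The slide does not name specific techniques; these two transforms, the function names, and the tiny 2 x 3 image are assumptions. Whether a transform is appropriate depends on the task: a horizontal flip, for example, would change the meaning of some handwritten digits.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def shift_right(image, pixels=1, fill=0):
    """Move the content right, filling the vacated columns with `fill`."""
    width = len(image[0])
    return [[fill] * pixels + row[: width - pixels] for row in image]


def augment(image):
    """Return the original picture plus two transformed copies."""
    return [image, flip_horizontal(image), shift_right(image)]


# Hypothetical 2 x 3 gray scale image.
image = [
    [1, 2, 3],
    [4, 5, 6],
]

augmented = augment(image)
print(len(augmented))  # 3
print(augmented[1])    # [[3, 2, 1], [6, 5, 4]]
print(augmented[2])    # [[0, 1, 2], [0, 4, 5]]
```

Each transformed copy keeps the label of the original picture, so one labeled example becomes three.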
ConvNetJS
(Deep Learning in your browser)
• http://cs.stanford.edu/people/karpathy/convnetjs/index.html
