Introduction to Computer Vision
Chen Sagiv
Co-founder and co-CEO, SagivTech Ltd.
Talk outline
• Computer vision: what is it and what is it good for?
• A few basics
• Classical computer vision
• The AI revolution
• Getting into the world of computer vision
What is computer vision?
• Enhancing images and extracting information from them using
mathematical tools and computer processing
• Goals:
– Optimal presentation of the image to the human viewer
– Computerized analysis of images and extraction of information
Where is computer vision used?
• Medicine
• Autonomous vehicles
• Virtual reality and augmented reality
• Industry: defect detection
• Social networks
• Defense industry, security and law enforcement
• Agriculture
• Fashion and personal care
Taken from: http://www.cvl.isy.liu.se/
What is a digital image?
How is a digital image formed?
• The "optics": image acquisition by a sensor that is sensitive to the
relevant wavelength: visible light, ultrasound (US), infrared
• The sensor produces an electrical signal proportional to the received energy
• Sampling of the analog signal
• DSP: handling of color, defective pixels and noise
The human visual system
Taken from Wikipedia
Spatial sampling
Taken from Digital Image Processing, Gonzalez
The sampling theorem and the Nyquist frequency
• The theorem states that, to reconstruct a sampled signal, the sampling
rate must be at least twice the highest frequency present in the signal.
This minimum sampling rate is called the Nyquist rate; half the
sampling rate is the Nyquist frequency.
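The sampling criterion can be illustrated with a small sketch. The tone frequencies (7 Hz and 3 Hz) and the 10 Hz sampling rate are illustrative assumptions, not values from the talk; the function names are hypothetical.

```python
import math

def sample_cosine(freq_hz, sample_rate_hz, num_samples):
    """Sample cos(2*pi*f*t) at t = n / sample_rate for n = 0..num_samples-1."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(num_samples)]

def is_adequately_sampled(max_freq_hz, sample_rate_hz):
    """Sampling-theorem check: the rate must be at least twice the highest frequency."""
    return sample_rate_hz >= 2 * max_freq_hz

# A 7 Hz tone sampled at 10 Hz violates the criterion (10 < 14) ...
undersampled = sample_cosine(7.0, 10.0, 20)
# ... and its samples coincide with those of a 3 Hz tone (10 - 7 = 3): aliasing.
alias = sample_cosine(3.0, 10.0, 20)
max_difference = max(abs(a - b) for a, b in zip(undersampled, alias))
```

Once the samples are taken, the two tones cannot be told apart, which is why the rate has to be chosen before sampling.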
How many gray levels are needed?
• For display, 8 bits are used
• For image analysis, images of 10 to 12 bits, and even more, are
processed
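As a rough sketch of what these bit depths mean in practice, the snippet below counts gray levels and reduces a higher bit-depth value to 8 bits for display. Dropping the least significant bits is only one possible mapping and is an assumption here; the function names are hypothetical.

```python
def gray_levels(bits):
    """Number of distinct gray levels representable with the given bit depth."""
    return 2 ** bits

def to_display_8bit(value, source_bits):
    """Map a pixel value from `source_bits` (>= 8) to 8 bits by dropping low-order bits."""
    if source_bits < 8:
        raise ValueError("source_bits must be at least 8")
    if not 0 <= value < 2 ** source_bits:
        raise ValueError("value out of range for the given bit depth")
    return value >> (source_bits - 8)

levels_display = gray_levels(8)     # 256 levels for display
levels_analysis = gray_levels(12)   # 4096 levels for analysis
brightest = to_display_8bit(4095, 12)
```

A 12-bit image keeps 16 distinct analysis values inside every single display level, which is the detail that is lost when only the 8-bit version is stored.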
Taken from Digital Image Processing, Gonzalez
Image denoising
Linear Denoising Filters
Speckle Noise Gaussian Noise Salt & Pepper Noise
Median Filter
Speckle Noise Gaussian Noise Salt & Pepper Noise
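The difference between a linear filter and the median filter on salt & pepper noise can be sketched on a tiny image. This is a minimal illustration, assuming a 3x3 box (mean) filter as the linear filter and a single corrupted pixel; it is not the code behind the slide figures.

```python
from statistics import mean, median

def filter3x3(image, reducer):
    """Apply `reducer` to every 3x3 neighborhood; border pixels are copied unchanged."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = reducer(window)
    return out

# Flat gray image (value 10) with a single "salt" pixel in the middle.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255

linear = filter3x3(noisy, mean)        # box filter: spreads the outlier around
nonlinear = filter3x3(noisy, median)   # median filter: discards the outlier
```

The mean filter smears the outlier over its neighborhood, while the median filter restores the flat image exactly in this example.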
Segmentation and edge detection
What is segmentation based on?
• Searching for the edge, or for points where there is a transition between
regions: edge-based methods
• Defining a region whose pixels are "similar" to one another:
region-based methods
Gray-level profile
First derivative
Second derivative
Approximations
• Sobel
  -1 -2 -1        1  0 -1
   0  0  0        2  0 -2
   1  2  1        1  0 -1
• Prewitt
  -1 -1 -1        1  0 -1
   0  0  0        1  0 -1
   1  1  1        1  0 -1
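A minimal sketch of how the Sobel kernels above approximate the first derivative: each kernel is correlated with the image and the two responses are combined into a gradient magnitude. The test image (a vertical step edge) and the helper names are assumptions made for illustration.

```python
import math

# Sobel kernels as listed on the slide.
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges
SOBEL_X = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]   # responds to vertical edges

def correlate3x3(image, kernel, y, x):
    """Weighted sum of the 3x3 neighborhood centered at (y, x)."""
    return sum(kernel[i][j] * image[y + i - 1][x + j - 1]
               for i in range(3) for j in range(3))

def gradient_magnitude(image):
    """Gradient magnitude for interior pixels; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = correlate3x3(image, SOBEL_X, y, x)
            gy = correlate3x3(image, SOBEL_Y, y, x)
            out[y][x] = math.hypot(gx, gy)
    return out

# Vertical step edge: dark on the left, bright on the right.
step_edge = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
edges = gradient_magnitude(step_edge)
```

The response is nonzero only in the two columns adjacent to the step, which is the behavior an edge detector needs.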
Introduction talk to Computer Vision
Global Processing: The Hough Transform
• Every straight line in the image can be described by an equation.
• Infinitely many lines can pass through any given point on the line
• In the Hough transform, every point "votes" for the lines that can
pass through it
• The line with the most votes wins!
The Hough Transform
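The voting idea can be sketched in a few lines. The sketch assumes the common normal parameterization rho = x*cos(theta) + y*sin(theta) with 1-degree and 1-pixel bins; the slide does not say which line equation is used, and the point set is made up for illustration.

```python
import math
from collections import Counter

def hough_lines(points, theta_step_deg=1):
    """Each point votes for every (rho, theta) line that passes through it."""
    votes = Counter()
    for x, y in points:
        for theta_deg in range(0, 180, theta_step_deg):
            theta = math.radians(theta_deg)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, theta_deg)] += 1
    return votes

# Ten points on the horizontal line y = 20, plus two unrelated points.
line_points = [(x, 20) for x in range(0, 100, 10)]
outliers = [(5, 3), (40, 70)]

votes = hough_lines(line_points + outliers)
(best_rho, best_theta), best_count = votes.most_common(1)[0]
```

The winning bin corresponds to theta = 90 degrees and rho = 20, i.e. the line y = 20, and it collects one vote from each of the ten collinear points.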
SIFT
Scale-Invariant Feature Transform
A motivating application
Building a panorama
• We need to match/align/register images
Taken from PPT of M. Brown and D. Lowe, University of British
Columbia
Building a panorama
1. Detect feature points in both images
2. Find corresponding pairs
3. Find a parametric transformation (e.g. homography)
4. Warp (right image to left image)
Taken from PPT of M. Brown and D. Lowe, University of British Columbia
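Step 3 of the list above can be sketched with the direct linear transform (DLT), which recovers a homography from at least four point correspondences. This is an assumption about one standard way to do it, not necessarily the method in the cited slides; the correspondences are synthetic and the function names are hypothetical. Real pipelines usually add coordinate normalization and a robust estimator such as RANSAC.

```python
import numpy as np

def estimate_homography(src, dst):
    """DLT: solve A h = 0 for the 3x3 homography that maps src points to dst points."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)      # right singular vector of the smallest singular value
    return h / h[2, 2]            # fix the arbitrary scale

def apply_homography(h, points):
    """Map 2-D points through a homography using homogeneous coordinates."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ h.T
    return mapped[:, :2] / mapped[:, 2:3]

# Synthetic ground truth: a mild perspective change plus a shift.
h_true = np.array([[1.0, 0.1, 30.0],
                   [0.05, 1.0, -5.0],
                   [1e-4, 0.0, 1.0]])
src_points = [(0, 0), (100, 0), (100, 80), (0, 80)]
dst_points = apply_homography(h_true, src_points)

h_estimated = estimate_homography(src_points, dst_points)
```

Warping (step 4) then amounts to sending every pixel coordinate of one image through `apply_homography` and resampling.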
Scale-Space
$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)$$
$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} e^{-(x^2 + y^2)/(2\sigma^2)}$$
where $*$ denotes convolution in $x$ and $y$.
Difference of Gaussians (DoG)
http://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf
Taken from David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
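A small sketch of the scale-space and DoG construction: the image is smoothed with a sampled Gaussian at two nearby scales and the results are subtracted. The kernel radius (about three sigma), the edge padding and the scale ratio k = sqrt(2) are assumptions chosen for illustration; the function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Sampled 2-D Gaussian G(x, y, sigma), truncated at about 3 sigma and renormalized."""
    radius = int(3 * sigma + 0.5)
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
    return g / g.sum()

def smooth(image, sigma):
    """L(x, y, sigma): the image convolved with the Gaussian kernel (edge-padded)."""
    image = np.asarray(image, dtype=float)
    kernel = gaussian_kernel(sigma)
    radius = kernel.shape[0] // 2
    padded = np.pad(image, radius, mode="edge")
    out = np.zeros(image.shape, dtype=float)
    rows, cols = image.shape
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * padded[dy:dy + rows, dx:dx + cols]
    return out

def difference_of_gaussians(image, sigma, k=2 ** 0.5):
    """DoG: L(x, y, k * sigma) - L(x, y, sigma)."""
    return smooth(image, k * sigma) - smooth(image, sigma)

# A single bright pixel: the DoG response is strongly negative at its location.
impulse = np.zeros((21, 21))
impulse[10, 10] = 1.0
dog = difference_of_gaussians(impulse, sigma=1.0)
```

Because the Gaussian is symmetric, the shifted-sum loop is both a correlation and a convolution, so it matches the definition of L above.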
Scale-Space Extrema Detection
• X is selected if it is larger or smaller than all 26
neighbors
Taken from David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
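The 26-neighbor test can be written directly: a sample is compared with its 8 neighbors in the same DoG layer and the 9 samples in each of the two adjacent layers. The array layout `dog[scale, y, x]` and the function name are assumptions for this sketch.

```python
import numpy as np

def is_scale_space_extremum(dog, s, y, x):
    """True if dog[s, y, x] is strictly larger or strictly smaller than all 26 neighbors."""
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
    center = dog[s, y, x]
    neighbors = np.delete(cube.ravel(), 13)   # index 13 is the center of the 3x3x3 cube
    return bool(center > neighbors.max() or center < neighbors.min())

# Three DoG layers of 5x5 samples with one isolated peak in the middle layer.
dog_stack = np.zeros((3, 5, 5))
dog_stack[1, 2, 2] = 1.0
peak_found = is_scale_space_extremum(dog_stack, 1, 2, 2)
neighbor_found = is_scale_space_extremum(dog_stack, 1, 2, 3)
```

The sketch only handles interior samples; the first and last layers and the image border have fewer than 26 neighbors and are skipped in practice.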
Keypoint Localization
Threshold on minimal contrast
Threshold on ratio of
principal curvatures
Taken from David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
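The two thresholds can be sketched as simple predicates. The default values (contrast 0.03 for intensities in [0, 1], curvature ratio r = 10) come from the cited Lowe paper rather than from the slide, and the function names are hypothetical. The curvature test uses the 2x2 Hessian of the DoG at the keypoint: Tr(H)^2 / Det(H) < (r + 1)^2 / r.

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Reject edge-like points whose principal curvatures differ by more than a factor r."""
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:                      # curvatures of opposite sign: not a stable extremum
        return False
    return trace * trace / det < (r + 1) ** 2 / r

def keep_keypoint(dog_value, dxx, dyy, dxy, contrast_threshold=0.03, r=10.0):
    """Keep a candidate only if it has enough contrast and is not edge-like."""
    return abs(dog_value) >= contrast_threshold and passes_edge_test(dxx, dyy, dxy, r)

blob_like = keep_keypoint(0.08, dxx=-1.0, dyy=-1.0, dxy=0.0)    # similar curvatures
edge_like = keep_keypoint(0.08, dxx=-20.0, dyy=-1.0, dxy=0.0)   # one dominant curvature
low_contrast = keep_keypoint(0.01, dxx=-1.0, dyy=-1.0, dxy=0.0)
```

Working with the trace and determinant avoids computing the eigenvalues of the Hessian explicitly.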
Orientation assignment
• Create weighted histogram of
local gradient directions
computed at selected scale
• Assign canonical orientation at
peak of smoothed histogram
• Where the histogram has multiple strong peaks,
create a separate keypoint for each peak
Taken from David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
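A simplified sketch of orientation assignment: gradient directions in a patch are accumulated into a histogram weighted by gradient magnitude, and every sufficiently strong local peak yields an orientation. The 36 bins and the 80% peak ratio follow the cited Lowe paper; the Gaussian window and the histogram smoothing are omitted here for brevity, and the function name is hypothetical.

```python
import numpy as np

def dominant_orientations(patch, num_bins=36, peak_ratio=0.8):
    """Return bin-center angles (degrees) of strong peaks in the gradient-direction histogram."""
    patch = np.asarray(patch, dtype=float)
    gy, gx = np.gradient(patch)                     # derivatives along rows, then columns
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0
    bin_width = 360.0 / num_bins
    bins = (angle // bin_width).astype(int) % num_bins
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=num_bins)
    peaks = [b for b in range(num_bins)
             if hist[b] >= peak_ratio * hist.max()
             and hist[b] > hist[b - 1]
             and hist[b] > hist[(b + 1) % num_bins]]
    return [(b + 0.5) * bin_width for b in peaks]

# Intensity increasing from left to right: every gradient points along +x (0 degrees).
ramp = np.tile(np.arange(8, dtype=float), (8, 1))
orientations = dominant_orientations(ramp)
```

When two peaks pass the ratio test the function returns two angles, which corresponds to creating one keypoint per peak.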
Keypoint descriptor
https://www.youtube.com/watch?v=FsFC8sCpDSw
Taken from David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
• If you’ve been to a concert recently, you’ve probably seen how many people take videos of the
event with mobile phone cameras
• Each user has only one video – taken from one angle and location and of only moderate quality
Mobile Crowdsourcing Video Scene
Reconstruction
Creation of the 3D Video Sequence
The scene is photographed by
several people using their cell
phone cameras
The video data is
transmitted via the
cellular network to a
High Performance
Computing server.
Following time
synchronization, resolution
normalization and spatial
registration, the several videos
are merged into a 3-D video
cube.
TIME
Spatial Calibration
Feature detection +
Matching
Fundamental matrix
estimation
Global registration
Time & Audio Synchronization
• Precise : epipolar matching is both fast and accurate
• Dense multi-scale description of the images using binary descriptors
3D Model Reconstruction
• Precise : epipolar matching is both fast and accurate
• Empirical probability density check to discard false positives at occlusion
points
Correct match : max peak above other local max
Wrong match : max peak similar to other local max
3D Model Reconstruction
• Robust : works even with a minimal set of inputs
• two viewpoints already sufficient for dense reconstruction
• very few erroneous points
3D reconstruction
3D Model Reconstruction
3D Visualizer for Dynamic Scenes
moving unknown
Is the human superior to the computer?
Kanizsa
The AI revolution
• Image classification revolutionized by DL
– ImageNet: classification error fell from 27% to ~5% in three years
What happened in ImageNet 2012 ?
• In this case, our classifier would have a decision boundary
more complex than the simple straight line.
• All the training patterns would be separated perfectly.
Learning Methods - Supervised
• Simpler recognizer → better performance on novel patterns.
• This is one of the central problems in statistical pattern
recognition.
Learning Methods - Supervised
• Feature extraction
– Discriminative features
– Invariant features with respect to translation, rotation and scale.
• Classification
– Use a feature vector provided by a feature extractor to assign the
object to a category
Learning Methods - Supervised
• A multi-layered Neural Network (NN)
– With nonlinearity
• The input to the DL NN is presented at the input layer
– Images, sound, language...
• Hidden layers extract increasingly abstract features
• Output layer contains the result
Neural Networks
• More complex neural networks with multiple layers and multiple output neurons are
theoretically capable of separation using any continuous surface.
• The straight line depicts the separation achieved by a simple Perceptron and the curve the
separation by a multi-layered network (left), which is in theory able to learn any separating
function.
Learning methods
Supervised
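The contrast between a simple Perceptron and a multi-layered network can be sketched on a toy problem. The example assumes the classic AND and XOR truth tables as data and a hand-set two-layer network; none of this comes from the slides, and the function names are hypothetical.

```python
def step(z):
    """Threshold nonlinearity used by the classic perceptron."""
    return 1 if z > 0 else 0

def train_perceptron(samples, epochs=50, lr=1.0):
    """Perceptron learning rule for two inputs; returns weights and bias."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            error = target - step(w[0] * x1 + w[1] * x2 + b)
            w[0] += lr * error * x1
            w[1] += lr * error * x2
            b += lr * error
    return w, b

def accuracy(w, b, samples):
    hits = sum(step(w[0] * x1 + w[1] * x2 + b) == t for (x1, x2), t in samples)
    return hits / len(samples)

def two_layer_xor(x1, x2):
    """Hidden units compute OR and AND; the output fires for 'OR but not AND'."""
    h_or = step(x1 + x2 - 0.5)
    h_and = step(x1 + x2 - 1.5)
    return step(h_or - h_and - 0.5)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_accuracy = accuracy(*train_perceptron(AND), AND)   # linearly separable
xor_accuracy = accuracy(*train_perceptron(XOR), XOR)   # no separating straight line exists
```

A single straight line separates AND but not XOR, while adding one hidden layer with a nonlinearity is enough to represent XOR.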
What can you do with Machine & Deep Learning ?
• Train classifiers for specific recognition tasks
• Localization
– fully convolutional networks
– train networks, e.g. RCNN, YOLO
The task: What object category do we see in the image?
Benchmark: 1000 category “ImageNet” dataset.
(from Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, ImageNet Classification with Deep
Convolutional Neural Networks, NIPS, 2012.)
Image Classification
The 2012 AlexNet results jump-started the current neural network wave.
Current results (networks with tens/hundreds of layers) perform better than humans on this task!
Image Classification - continued
Task: Find bounding boxes and categories of objects in the image.
Benchmark: PASCAL VOC / Microsoft COCO
In right video: YOLOv2 (You Only Look Once)
Object Detection
https://www.youtube.com/watch?v=VOC3huqHrssg
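Detection benchmarks such as PASCAL VOC score a predicted bounding box by its intersection over union (IoU) with the ground-truth box. A minimal sketch, assuming boxes given as (x1, y1, x2, y2) corner coordinates; the function name and the example boxes are made up for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0

predicted = (0, 0, 2, 2)
ground_truth = (1, 1, 3, 3)
overlap = iou(predicted, ground_truth)   # overlap area 1, union area 7
```

In PASCAL VOC a detection counts as correct when its IoU with a ground-truth box of the same category exceeds 0.5.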
Task: Assign each image pixel a category label (person, wall, road, dog, ..)
Example Dataset: PASCAL VOC2012 Challenge
SegModel, DeepLabv2, and many more..
Image Semantic Segmentation
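The final step of semantic segmentation can be sketched as a per-pixel decision: given one score per class for every pixel, the label is the class with the highest score. The scores below are invented for illustration; in a real system they would be the output of a segmentation network.

```python
def label_map(scores, class_names):
    """scores[y][x] holds one score per class; return the best class name per pixel."""
    return [[class_names[max(range(len(pixel)), key=pixel.__getitem__)]
             for pixel in row]
            for row in scores]

CLASSES = ["person", "wall", "road", "dog"]

# A 2x2 "image" with hypothetical per-class scores for each pixel.
scores = [
    [[0.1, 0.7, 0.1, 0.1], [0.8, 0.1, 0.05, 0.05]],
    [[0.2, 0.1, 0.6, 0.1], [0.1, 0.1, 0.2, 0.6]],
]
labels = label_map(scores, CLASSES)
```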
Task: Given an image and a question, answer the question.
Question Answering
Task: Given a dataset of images, generate new artificial images that look real!
State of the art: Generative Adversarial Networks (GANs)
Image Generation
Steering the wheels of self driving cars
Super resolution
Image completion
Saliency detection
Human pose detection
Facial keypoints (nose, eye, ear..)
Image captioning
Activity recognition
And many, many more tasks ….
SagivTech Traffic Lights Detection using DL
With dlib & a few images from Google Street View
https://www.youtube.com/watch?v=jg444J2AmOI
https://www.youtube.com/watch?v=YV4y1iqo_TQ
What can you do to get into this world ?
• Theory:
– Get to know basic computer vision and some “classical” algorithms, e.g.
Viola-Jones, SIFT, etc.
– Get to know Deep Learning, e.g. CS231n by Stanford
• Practice:
– Get to know and use OpenCV
– Hands-on experience with Caffe, TensorFlow, etc.
Technology and Professional Services company
Established in 2009 and headquartered in Israel
SagivTech Snapshot
• What we do:
• Technological
Solutions
• Projects
• Research
• Core domains:
• Computer Vision
• Deep Learning
• Code Optimization
• GPU Computing
Thanks to the following SagivTech team members and
collaborators:
Acknowledgements
• Prof. Peter Maass, University of Bremen
• Prof. Pierre Vandergheynst, EPFL
• Dov Eilot, SagivTech
• Jacob Gildenblat, SagivTech
• Amir Egozi, SceneNet project
Thank You
For more information please contact
Chen Sagiv
chen@sagivtech.com
+972547706089
