© 2021 Synopsys
Case Study: Facial Detection & Recognition for Always-On Applications
Jamie Campbell
Synopsys
Face Recognition – An Introduction
The challenge of identifying and verifying faces from images:
“Who is this person?” and “Is this the person?”
• Easy task for humans…but much harder for machines
• Useful biometric identification technique
• Advantages
• Works using inexpensive camera sensors
• Does not require physical interaction from the user
Face Recognition – Some History
• First attempts to use computers to recognize faces happened in the 1960s
• Required manual recording of facial features
• Technology of the time limited developments
• Linear algebra and the “Eigenface” approach allowed for significant developments in the field
• Used as a basis for many deep learning algorithms
• Accuracies improved significantly in the 2000s and 2010s
Sources: https://anyconnect.com/blog/the-history-of-facial-recognition-technologies & https://en.wikipedia.org/wiki/File:RAND_Tablet.png
Face Recognition in Embedded Computing
New application domains demand on-device face recognition
• Smart Security
• Door Locks
• Smart Phones
• Unlocking Laptops
• Payment Devices
• Vending Machines, Parking Meters, etc.
Embedded Face Recognition System Characteristics
• Embedded face recognition systems must
✓ Perform always-on monitoring
✓ Be low-power (to support battery-powered scenarios)
✓ Respond to inputs in real time
✓ Be capable of handling the processing requirements of complex face recognition algorithms
DESIGN CHALLENGE
Balance the design constraints of an always-on, deeply embedded device AND the performance demands of complex face detection neural networks
Steps to Recognizing a Face
1. Detecting a face: Is there a face in the input image?
2. Locating a face: Where is the face in the input image?
3. Identifying a face: Extract features and match with a database
Can we do this efficiently in an embedded system?
Each step requires a different amount and type of computation
Solution:
Use a phased approach for Detection & Feature Extraction
Phase 1 – Is there a face?
• Low-complexity “face detector” NN classifier graph executes continuously
• Uses simple algorithms and efficient hardware to minimize “always-on” energy consumption
• Trigger event: Signal “Yes” detections to next phase
Phase 2 – Confirm there’s a face and find it in the frame
• Execute a high-accuracy face detector NN graph when signaled
• Leverage dedicated NN accelerator to minimize execution time of the most complex algorithm
Phase 3 – Compute the “face embedding” vector
• Vector represents properties of a certain face – can be compared with database
Complexity (MACs) of each phase, normalized to Phase 1:
Phase 1: 1x
Phase 2: 59000x
Phase 3: 2500x
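The phased approach can be read as a cascade with early exits: each phase runs only if the previous one succeeded. The following is a minimal sketch of that control flow, not the presenter's implementation. The three model functions and the dictionary-based "frames" are hypothetical stand-ins for the NN graphs and camera input.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical bounding-box layout: (x, y, width, height).
Box = Tuple[int, int, int, int]


def run_cascade(
    frame: object,
    has_face: Callable[[object], bool],
    locate_face: Callable[[object], Optional[Box]],
    embed_face: Callable[[object, Box], Sequence[float]],
) -> Optional[Sequence[float]]:
    """Run the three phases in order, stopping as early as possible."""
    # Phase 1: cheap always-on classifier; most frames stop here.
    if not has_face(frame):
        return None
    # Phase 2: high-accuracy detector; also filters Phase 1 false positives.
    box = locate_face(frame)
    if box is None:
        return None
    # Phase 3: embedding vector for the detected face region.
    return embed_face(frame, box)


# Hypothetical stand-in models; a real system would run NN graphs here.
def stub_has_face(frame):
    return frame["maybe_face"]


def stub_locate_face(frame):
    return frame.get("box")


def stub_embed_face(frame, box):
    return [float(v) for v in box]


empty = {"maybe_face": False}
false_alarm = {"maybe_face": True, "box": None}
real_face = {"maybe_face": True, "box": (10, 20, 64, 64)}

results = [
    run_cascade(f, stub_has_face, stub_locate_face, stub_embed_face)
    for f in (empty, false_alarm, real_face)
]
print(results)
```

Only the third frame reaches Phase 3; the first stops after Phase 1 and the second after Phase 2, which is where the energy saving comes from.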
Dividing up the Job
Share the work between a low-power and a high-performance processor to achieve application power targets
Processor 1 (“Always On”) – Phase 1
• Low-power core continuously monitors for faces
• Low-res face detection NN, with input from the ISP
• Sends the wake-up signal to Processor 2
Processor 2 – Phase 2 and Phase 3
• In power-down state unless woken up by Processor 1
• Phase 2: High-accuracy face detection NN
• Phase 3: Face recognition NN and database matching
Processor 1: ARC EM DSP Processor
• Always-on face detection NN
Processor 2: ARC EV Processor
• DNN accelerator for high-efficiency NN processing
• Vector engine
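The division of labor between the two processors can be sketched as a small simulation: Processor 1 loops over Phase 1 results, and Processor 2 leaves its power-down state only when a wake-up signal arrives. This is an illustrative model under stated assumptions. The class and function names are hypothetical, and the input stream of 240 results (4 inferences per second for 60 seconds, with one hit) is made up for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class PowerState(Enum):
    POWER_DOWN = "power-down"
    ACTIVE = "active"


@dataclass
class Processor2:
    """Stand-in for the high-performance core; starts powered down."""

    state: PowerState = PowerState.POWER_DOWN
    wake_count: int = 0

    def wake_and_run(self) -> None:
        # Wake-up signal received: boot, run Phases 2 and 3, power down again.
        self.state = PowerState.ACTIVE
        self.wake_count += 1
        # ... Phase 2 and Phase 3 would execute here ...
        self.state = PowerState.POWER_DOWN


def always_on_loop(phase1_results: Iterable[bool], cpu2: Processor2) -> int:
    """Processor 1 loop: check each Phase 1 result, wake Processor 2 on a hit.

    Returns the number of Phase 1 inferences performed.
    """
    inferences = 0
    for face_seen in phase1_results:
        inferences += 1
        if face_seen:
            cpu2.wake_and_run()
    return inferences


# Hypothetical stream: 239 "no face" results followed by one "face" result.
stream = [False] * 239 + [True]
cpu2 = Processor2()
total = always_on_loop(stream, cpu2)
print(total, cpu2.wake_count, cpu2.state.value)
```

In this run Processor 1 performs all 240 inferences while Processor 2 is woken once and ends powered down again.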
Processor 1: Synopsys ARC EM9D
Efficiently Executes Always-On AI Workloads
• RISC core with DSP ISA extensions
• Includes key features for efficient DSP/NN processing
• Vectorized multiply-accumulate
• Zero-overhead looping
• Fast XY memory & address generation units for instruction-level parallelism
• Optimized NN libraries
• RTOS options for more complex control and interface tasks
Phase 1: Simple “Face/No Face” Detection on ARC EM9D
Is there a person looking at the camera?
• Simple binary classifier model – “Face” or “No Face”
• 4-layer CNN algorithm
• Low-res 36x36 input
• Executes efficiently using optimized NN libraries available for EM9D
• E.g., run inference 4x per second – ensures real-time response for target application
• Clock ARC EM processor just fast enough to meet inference rate
Reference: http://parse.ele.tue.nl/system/attachments/11/original/paperspeedsigncnn.pdf
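Clocking the processor "just fast enough" amounts to a simple product: the minimum clock frequency is the cycle count of one inference times the required inference rate. The sketch below shows that arithmetic. The cycle count is a hypothetical placeholder, since the slides give no figure for the 4-layer CNN, and the optional `margin` parameter is likewise an assumption added for illustration.

```python
def min_clock_hz(
    cycles_per_inference: int,
    inferences_per_second: float,
    margin: float = 1.0,
) -> float:
    """Lowest clock frequency that still sustains the inference rate.

    margin > 1.0 leaves headroom for control and interface tasks.
    """
    return cycles_per_inference * inferences_per_second * margin


# Hypothetical cycle count for one Phase 1 inference (not from the slides).
CYCLES_PER_INFERENCE = 2_000_000
INFERENCES_PER_SECOND = 4  # rate quoted on the slide

clock = min_clock_hz(CYCLES_PER_INFERENCE, INFERENCES_PER_SECOND)
print(f"{clock / 1e6:.1f} MHz")  # prints "8.0 MHz" for these assumed inputs
```

Running slower than this misses the inference rate; running faster only helps if the core can then spend the spare time in a lower-power state.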
Processor 2: Synopsys ARC EV7x
• ARC EV7x Vision Processors include
• Up to four enhanced vector processing units (VPUs)
• DNN accelerator with up to 3520 MACs
• Provides scalability for performance vs. power tradeoffs
• Designed for maximum power efficiency for neural network processing
[Block diagram: Synopsys DesignWare ARC EV7x Processor]
• Vector Engine (licensable): 1, 2 or 4 VPU configurations; each VPU has a 512-bit vector DSP, 32-bit scalar, VFPU, VCCM and cache
• DNN Accelerator (licensable): 880 to 3520 MAC configurations; 2D convolutions, fully connected layers, activations
• Shared blocks: trace, power management, sync & debug, AXI interfaces, DMA, coherency, shared memory, closely coupled memories
• MetaWare EV Development Toolkit: development tools (OpenCL™ C, C/C++), libraries & runtime (OpenCV, OpenVX™), simulators, virtual platforms, NN SDK
Phase 2: Face Detection on ARC EV7x
Confirm there’s a face and find it in the frame
• Runs only after wake-up trigger event from Phase 1
• Can be a real detection or a false positive
• E.g., once every 60 seconds (very busy door-lock camera + some false detections)
• Localize the face in the camera input frame
• E.g., MobileNet-SSD graph with 416x416 input
• Most complex phase – needs to execute efficiently
• Use ARC EV7x DNN accelerator
[Diagram: the wake-up signal triggers the face detector NN graph, which either finds a face or reports “No Face Detected”]
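The benefit of running Phase 2 rarely can be estimated from the figures in the deck: the normalized MAC counts of the three phases, the 4 inferences per second for Phase 1, and the once-per-60-seconds trigger for Phase 2. The sketch below is back-of-the-envelope arithmetic, not a measurement. It assumes the 59000x figure belongs to Phase 2 (the phase the slides call most complex) and 2500x to Phase 3, and that Phase 3 runs after every Phase 2 trigger, which is a worst case.

```python
# Normalized MACs per inference (Phase 1 = 1); phase assignment is an assumption.
PHASE_COMPLEXITY = {"phase1": 1, "phase2": 59_000, "phase3": 2_500}

# How often each phase runs, in inferences per second.
RUNS_PER_SECOND = {"phase1": 4.0, "phase2": 1 / 60, "phase3": 1 / 60}

# Average load of each phase, in "Phase 1 inferences" per second.
average_load = {
    phase: PHASE_COMPLEXITY[phase] * RUNS_PER_SECOND[phase]
    for phase in PHASE_COMPLEXITY
}
cascade_total = sum(average_load.values())

# Naive alternative: run all three phases on every frame, 4 times per second.
always_everything = 4.0 * sum(PHASE_COMPLEXITY.values())
saving = always_everything / cascade_total

for phase, load in average_load.items():
    print(f"{phase}: {load:.1f} units/s")
print(f"cascade total: {cascade_total:.1f} units/s")
print(f"naive total: {always_everything:.1f} units/s ({saving:.0f}x more)")
```

Even at one trigger per minute, Phase 2 still dominates the average compute under these assumptions, which is consistent with the slide's point that it needs to execute efficiently.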
Phase 3: Face Recognition on ARC EV7x
Compute the “face embedding” vector
• Executed after successful Phase 2 detection
• Extract face embedding vectors suitable for database lookup
• FaceNet (MobileNetv2-based)
• Use Phase 2’s detection bounding box as input
• Leverage ARC EV7x’s DNN accelerator
• Possible to have both graphs loaded at the same time
• Output of the face recognition NN graph: 128-entry embedding vector
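Database lookup with an embedding vector usually means a nearest-neighbor search: find the enrolled vector closest to the query and accept it only if it is close enough. The slides do not say which distance metric or threshold was used, so the sketch below assumes Euclidean distance and a made-up threshold. The names and the toy 128-entry vectors are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence

EMBEDDING_SIZE = 128  # entries per embedding vector, as stated on the slide


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_face(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    max_distance: float,
) -> Optional[str]:
    """Return the closest enrolled identity, or None if nothing is close enough."""
    best_name, best_distance = None, math.inf
    for name, reference in database.items():
        distance = euclidean_distance(query, reference)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance <= max_distance else None


# Toy enrolled embeddings (hypothetical values, not real FaceNet output).
database = {
    "alice": [0.0] * EMBEDDING_SIZE,
    "bob": [1.0] * EMBEDDING_SIZE,
}
query = [0.01] * EMBEDDING_SIZE    # very close to "alice"
stranger = [0.5] * EMBEDDING_SIZE  # far from both enrolled faces

print(match_face(query, database, max_distance=0.5))
print(match_face(stranger, database, max_distance=0.5))
```

The first lookup returns "alice" and the second returns None, covering the two questions from the introduction: “Who is this person?” and “Is this the person?”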
Justifying the Phased Heterogeneous Solution for Phase 1
Question: Could we use ARC EV for Phase 1?
Facts
• For both ARC EM and ARC EV, Phase 1 compute energy is not dominant
• Phase 1 workload is exceedingly simple for the ARC EV processor
• ARC EV will be idle most of the time (only 4 inferences/second over 60 seconds)
Options for idle ARC EV
1. Sleep mode -> Standby energy
2. Power down -> Boot-up energy
Metric                       ARC EM always-on   ARC EV sleep when idle   ARC EV power down when idle
Standby Energy Consumption   1x                 12x                      3x
ARC EM is Highly Energy-Efficient for Always-On Tasks
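The two idle options trade standby power against boot-up energy, and which one wins depends on how long the processor idles between wake-ups. The sketch below is a simplified model of that trade-off, not the method behind the table. It assumes the processor is idle for essentially the whole window and that a powered-down core draws nothing. Both energy constants are hypothetical, chosen only so that the 60-second case reproduces the table's 12:3 ratio.

```python
def sleep_energy(window_s: float, sleep_power: float) -> float:
    """Option 1: stay in sleep mode and pay standby power for the whole window."""
    return sleep_power * window_s


def power_down_energy(wakeups: int, boot_energy: float) -> float:
    """Option 2: power down when idle and pay boot-up energy on every wake-up."""
    return wakeups * boot_energy


def break_even_interval_s(boot_energy: float, sleep_power: float) -> float:
    """Idle time between wake-ups above which powering down costs less."""
    return boot_energy / sleep_power


WINDOW_S = 60.0
WAKEUPS = 4 * 60        # 4 inferences/second over 60 seconds
SLEEP_POWER = 0.2       # hypothetical, arbitrary energy units per second
BOOT_ENERGY = 0.0125    # hypothetical, arbitrary energy units per boot

e_sleep = sleep_energy(WINDOW_S, SLEEP_POWER)
e_down = power_down_energy(WAKEUPS, BOOT_ENERGY)
threshold = break_even_interval_s(BOOT_ENERGY, SLEEP_POWER)

print(f"sleep: {e_sleep:.2f} units, power down: {e_down:.2f} units")
print(f"break-even idle interval: {threshold:.4f} s")
```

With these assumed constants the break-even interval is shorter than the 0.25 s between inferences, so powering down wins, yet both options still cost more than the always-on ARC EM baseline in the table.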
Justifying the Phased Heterogeneous Solution for Phases 2 & 3
Question: Could we use ARC EM for Phase 2 and Phase 3?
Facts
• Phase 2 and 3 compute are complex workloads
• ARC EM-based execution would not be real-time (or would need an unreasonable clock speed to maintain inference rate)
• ARC EV provides significantly more compute parallelization, so it is more energy-efficient
Compute time   EM     EV
Phase 2        410x   1x
Phase 3        185x   1x
Energy         EM     EV
Phase 2        12x    1x
Phase 3        12x    1x
ARC EV Offers Real-Time Performance and Power Efficiency for Complex Workloads
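The compute-time ratios translate directly into a real-time check once an absolute latency is assumed. The slides give only ratios, so the ARC EV latencies and the response deadline below are hypothetical values picked to illustrate the arithmetic.

```python
# Compute-time ratios from the table above (EM relative to EV).
EM_SLOWDOWN = {"phase2": 410, "phase3": 185}

# Hypothetical ARC EV latencies in milliseconds (not from the slides).
ev_latency_ms = {"phase2": 20.0, "phase3": 5.0}

# Hypothetical response deadline for the whole detect-and-recognize step.
DEADLINE_MS = 1000.0


def meets_deadline(total_ms: float) -> bool:
    """True if the combined Phase 2 + Phase 3 latency fits the deadline."""
    return total_ms <= DEADLINE_MS


em_latency_ms = {
    phase: ev_latency_ms[phase] * EM_SLOWDOWN[phase] for phase in EM_SLOWDOWN
}
ev_total = sum(ev_latency_ms.values())
em_total = sum(em_latency_ms.values())

print(f"EV: {ev_total:.0f} ms, real-time: {meets_deadline(ev_total)}")
print(f"EM: {em_total:.0f} ms, real-time: {meets_deadline(em_total)}")
```

Under these assumptions the same work that takes 25 ms on ARC EV would take over 9 seconds on ARC EM, which is the sense in which EM-based execution "would not be real-time".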
Conclusion
• Face recognition is a multi-step problem
• For embedded environments, power minimization is key
• Distribute the workload across heterogeneous cores
• Combine the strengths of low-power and high-performance cores to achieve energy-saving goals
• Synopsys DesignWare IP offers scalable processors, such as the ARC EM and ARC EV7x families, that are suitable for this work
Resources
Power Efficient Facial Detection & Recognition with ARC Processors
https://www.synopsys.com/designware-ip/technical-bulletin/face-recognition-detection-arc-ev.html
Say Welcome to the Machine. Low-Power Machine Learning for Smart IoT Applications
https://www.synopsys.com/dw/doc.php/wp/arc_low_power_machine_learning_for_iot.pdf
FDDB: A Benchmark for Face Detection in Unconstrained Settings
http://vis-www.cs.umass.edu/fddb/fddb.pdf
The History of Facial Recognition Technologies: How Image Recognition Got So Advanced
https://anyconnect.com/blog/the-history-of-facial-recognition-technologies
2021 Embedded Vision Summit
Demo 1: SR-GAN Super Resolution on DesignWare ARC EV7x Processors
Demo 2: Simultaneous Localization and Mapping (SLAM) Acceleration on DesignWare ARC EV7x Processors
View and Q&A:
• May 27: 12:00 pm - 1:00 pm PT
• May 28: 10:00 am - 11:00 am PT
Thank You
