Deep Learning:
concepts and use cases
Julien Simon
Principal Technical Evangelist, AI and Machine Learning, AWS
@julsimon
October 2018
What to expect
• An introduction to Deep Learning theory
• Neurons & Neural Networks
• The Training Process
• Backpropagation
• Optimizers
• Common network architectures and use cases
• Convolutional Neural Networks
• Recurrent Neural Networks
• Long Short Term Memory Networks
• Generative Adversarial Networks
• Getting started
• Artificial Intelligence: design software applications which
exhibit human-like behavior, e.g. speech, natural language
processing, reasoning or intuition
• Machine Learning: using statistical algorithms, teach
machines to learn from featurized data without being
explicitly programmed
• Deep Learning: using neural networks, teach machines to
learn from complex data where features cannot be
explicitly expressed
An introduction
to Deep Learning theory
The neuron
$\sum_{i=1}^{I} x_i \cdot w_i + b = u$ (b is the bias)
“Multiply and Accumulate”
Activation functions
Source: Wikipedia
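A minimal sketch of the neuron described above: multiply and accumulate, then apply an activation function. The input values, weights, bias and the choice of ReLU and sigmoid are illustrative assumptions, not taken from the slides.

```python
import math

def sigmoid(u):
    """Squash u into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-u))

def relu(u):
    """Keep positive values, clamp negative values to zero."""
    return max(0.0, u)

def neuron(inputs, weights, bias, activation):
    """Compute u = sum(x_i * w_i) + b, then apply the activation function."""
    u = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(u)

# Hypothetical values, chosen only for illustration
x = [1.0, 2.0, 3.0]
w = [0.5, -1.0, 0.25]
b = 0.1

print(neuron(x, w, b, relu))
print(neuron(x, w, b, sigmoid))
```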
x =
x11, x12, …. x1I
x21, x22, …. x2I
… … …
xm1, xm2, …. xmI
I features
m samples
y =
2
0
…
4
m labels,
N2 categories
0,0,1,0,0,…,0
1,0,0,0,0,…,0
…
0,0,0,0,1,…,0
One-hot encoding
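One-hot encoding can be sketched in a few lines. The helper name `one_hot` and the choice of 5 categories are assumptions made for this example.

```python
def one_hot(labels, num_categories):
    """Turn integer labels into vectors with a single 1 at the label's index."""
    encoded = []
    for label in labels:
        row = [0] * num_categories
        row[label] = 1
        encoded.append(row)
    return encoded

# Labels 2, 0 and 4, assuming 5 categories
for row in one_hot([2, 0, 4], 5):
    print(row)
```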
Neural networks
Building a simple classifier
Biases are ignored for the rest of this discussion
Accuracy = Number of correct predictions / Total number of predictions
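As a quick illustration of the accuracy ratio, here is a small sketch. The function name and the sample labels are hypothetical.

```python
def accuracy(true_labels, predicted_labels):
    """Number of correct predictions divided by total number of predictions."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)

# Three predictions out of four match the real labels
print(accuracy([2, 0, 4, 1], [2, 0, 3, 1]))
```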
Initially, the network will not predict correctly
f(X1) = Y’1
A loss function measures the difference between
the real label Y1 and the predicted label Y’1
error = loss(Y1, Y’1)
For a batch of samples:
$\sum_{i=1}^{\text{batch size}} \text{loss}(Y_i, Y'_i) = \text{batch error}$
The purpose of the training process is to
minimize error by gradually adjusting weights.
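The slides do not name a specific loss function. The sketch below assumes categorical cross-entropy, a common choice for one-hot labels, and sums it over a batch to get the batch error.

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Loss for one sample: y_true is one-hot, y_pred holds predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

def batch_error(Y, Y_pred):
    """Sum of the per-sample losses over the batch."""
    return sum(cross_entropy(y, yp) for y, yp in zip(Y, Y_pred))

# Hypothetical batch of two samples with three categories
Y = [[0, 1, 0], [1, 0, 0]]
Y_pred = [[0.1, 0.8, 0.1], [0.5, 0.3, 0.2]]
print(batch_error(Y, Y_pred))
```

A confident, correct prediction contributes a loss close to zero; a wrong or hesitant one contributes more.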
Mini-batch Training
Training data set → Training (forward propagation, then backpropagation) → Trained neural network
Hyperparameters: batch size, learning rate, number of epochs
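To make the three hyperparameters concrete, here is a minimal sketch of mini-batch training for a single sigmoid neuron. The logical-AND data set, the hyperparameter values and the binary cross-entropy gradient are assumptions chosen to keep the example small; a real network would have more layers and use a framework.

```python
import math
import random

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def predict(sample, weights, bias):
    """Forward propagation for one sample."""
    return sigmoid(sum(x * w for x, w in zip(sample, weights)) + bias)

def train(samples, labels, batch_size=2, learning_rate=1.0, epochs=1000, seed=0):
    rng = random.Random(seed)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    indices = list(range(len(samples)))
    for _ in range(epochs):                      # one epoch = one pass over the data
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = [0.0] * len(weights)
            grad_b = 0.0
            for i in batch:
                # Gradient of binary cross-entropy for a sigmoid neuron
                error = predict(samples[i], weights, bias) - labels[i]
                for j, x in enumerate(samples[i]):
                    grad_w[j] += error * x
                grad_b += error
            # Adjust weights once per batch, scaled by the learning rate
            for j in range(len(weights)):
                weights[j] -= learning_rate * grad_w[j] / len(batch)
            bias -= learning_rate * grad_b / len(batch)
    return weights, bias

X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0, 0, 0, 1]                                 # logical AND
weights, bias = train(X, y)
predictions = [1 if predict(s, weights, bias) >= 0.5 else 0 for s in X]
print(predictions)
```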
Validation
Validation data set (also called dev set) → Neural network in training → Validation accuracy
Prediction at the end of each epoch
This data set must have the same distribution as real-life samples,
or else validation accuracy won’t reflect real-life accuracy.
Test
Test data set → Fully trained neural network → Test accuracy
Prediction at the end of experimentation
This data set must have the same distribution as real-life samples,
or else test accuracy won’t reflect real-life accuracy.
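One simple way to obtain training, validation and test sets is to shuffle the data once and slice it. The 80/10/10 ratio and the helper name below are assumptions, not recommendations from the slides.

```python
import random

def split_dataset(samples, train_fraction=0.8, validation_fraction=0.1, seed=42):
    """Shuffle, then slice into training, validation and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    n_val = round(len(shuffled) * validation_fraction)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]            # whatever remains
    return train, validation, test

data = list(range(100))                          # stand-in for real samples
train_set, validation_set, test_set = split_dataset(data)
print(len(train_set), len(validation_set), len(test_set))
```

Shuffling before slicing helps the three sets follow a similar distribution, provided the original data is itself representative of real-life samples.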
Stochastic Gradient Descent (1951)
Imagine you stand on top of a mountain (…).
You want to get down to the valley as quickly as
possible, but there is fog and you can only see
your immediate surroundings. How can you get
down the mountain as quickly as possible?
You look around and identify the steepest path
down, go down that path for a bit, again look
around and find the new steepest path, go down
that path, and repeat—this is exactly what
gradient descent does.
Tim Dettmers, University of Lugano, 2015
https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/
The « step size » depends on
the learning rate
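The mountain analogy can be sketched on a simple bowl-shaped function. This is plain gradient descent on f(x, y) = x² + y², an assumed toy function; the stochastic variant estimates the gradient from batches of samples instead.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly take a step against the gradient (the steepest way down)."""
    x, y = start
    for _ in range(steps):
        gx, gy = grad(x, y)
        x -= learning_rate * gx                  # step size depends on the learning rate
        y -= learning_rate * gy
    return x, y

def grad_f(x, y):
    """Gradient of f(x, y) = x**2 + y**2."""
    return 2 * x, 2 * y

x_min, y_min = gradient_descent(grad_f, start=(3.0, -4.0))
print(x_min, y_min)
```

With this function, a learning rate above 1.0 makes each step overshoot further than the last, so the descent diverges instead of reaching the valley.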
z=f(x,y)
Finding the slope with Derivatives
Source: Wikipedia, Oklahoma State University, Khan Academy
End-to-end example of computing
backpropagation with partial derivatives:
https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example
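A slope can also be estimated numerically, which is a handy check on hand-derived partial derivatives. The function f below is a hypothetical example for z = f(x, y).

```python
def partial_derivatives(f, x, y, h=1e-6):
    """Central finite differences: approximate dz/dx and dz/dy at (x, y)."""
    dz_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dz_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dz_dx, dz_dy

def f(x, y):
    # Analytically: dz/dx = 2x + 3y and dz/dy = 3x
    return x ** 2 + 3 * x * y

dz_dx, dz_dy = partial_derivatives(f, 1.0, 2.0)
print(dz_dx, dz_dy)
```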
Local minima and saddle points
« Do neural networks enter and
escape a series of local minima? Do
they move at varying speed as they
approach and then pass a variety of
saddle points? Answering these
questions definitively is difficult, but
we present evidence strongly
suggesting that the answer to all of
these questions is no. »
« Qualitatively characterizing neural network
optimization problems », Goodfellow et al.,
2015 https://arxiv.org/abs/1412.6544
Optimizers
https://medium.com/@julsimon/tumbling-down-the-sgd-rabbit-hole-part-1-740fa402f0d7
SGD works remarkably
well and is still widely
used.
Adaptive optimizers use
a variable learning rate.
Some even use a learning
rate per dimension
(Adam).
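To contrast a fixed learning rate with a per-dimension one, here is a sketch of a single SGD update next to an Adam update. The Adam defaults are the commonly published ones; the parameter and gradient values are hypothetical.

```python
import math

def sgd_step(params, grads, learning_rate=0.001):
    """Plain SGD: the same learning rate for every dimension."""
    return [p - learning_rate * g for p, g in zip(params, grads)]

class Adam:
    """Adam keeps running averages of gradients and squared gradients per dimension."""

    def __init__(self, n_params, learning_rate=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        self.learning_rate = learning_rate
        self.beta1, self.beta2, self.eps = beta1, beta2, eps
        self.m = [0.0] * n_params                # first moment (mean of gradients)
        self.v = [0.0] * n_params                # second moment (mean of squared gradients)
        self.t = 0

    def step(self, params, grads):
        self.t += 1
        updated = []
        for i, (p, g) in enumerate(zip(params, grads)):
            self.m[i] = self.beta1 * self.m[i] + (1 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1 - self.beta2) * g * g
            m_hat = self.m[i] / (1 - self.beta1 ** self.t)   # bias correction
            v_hat = self.v[i] / (1 - self.beta2 ** self.t)
            updated.append(p - self.learning_rate * m_hat / (math.sqrt(v_hat) + self.eps))
        return updated

params = [1.0, 1.0]
grads = [10.0, 0.1]                              # very different gradient scales
print(sgd_step(params, grads))
print(Adam(n_params=2).step(params, grads))
```

On the first step, SGD moves the first parameter a hundred times further than the second, while Adam moves both by roughly the learning rate.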
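Early stopping
[Figure: accuracy (up to 100%) and loss plotted against epochs, with curves for training accuracy, validation accuracy and the loss function. Annotations mark the best epoch and the OVERFITTING region.]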
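Early stopping can be sketched as a scan over per-epoch validation accuracy that gives up after a few epochs without improvement. The `patience` parameter and the accuracy history are assumptions for illustration.

```python
def best_epoch_with_early_stopping(validation_accuracies, patience=3):
    """Return (best epoch, best accuracy), stopping after `patience` epochs without gain."""
    best_accuracy = float("-inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, acc in enumerate(validation_accuracies, start=1):
        if acc > best_accuracy:
            best_accuracy, best_epoch = acc, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break                            # stop training here
    return best_epoch, best_accuracy

# Hypothetical history: validation accuracy peaks at epoch 4, then degrades
history = [0.60, 0.72, 0.80, 0.84, 0.83, 0.82, 0.81, 0.90]
print(best_epoch_with_early_stopping(history))
```

In this run, training stops at epoch 7, so the later value at epoch 8 is never seen; that is the trade-off of a small patience.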
« Deep Learning ultimately is about finding a minimum
that generalizes well, with bonus points for finding one
fast and reliably », Sebastian Ruder
Common network architectures
and use cases
Fully Connected Networks are nice, but…
• What if we need lots of layers in order to extract complex features?
• The number of parameters increases very quickly with the number of layers
• Overfitting is a constant problem
• What about large data?
• 256x256 images = 65,536 input neurons?
• What about 2D/3D data? Won’t we lose lots of info by flattening it?
• Images, videos, etc.
• What about sequential data, where the order of samples is
important?
• Translating text
• Predicting time series
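A quick calculation shows how fast parameters pile up in a fully connected network fed with a flattened 256x256 image. The hidden layer sizes and the 10 output categories are hypothetical; biases are counted here even though the earlier slides ignore them.

```python
def dense_parameter_count(layer_sizes):
    """Weights (n_in * n_out) plus biases (n_out) for each pair of consecutive layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

input_neurons = 256 * 256                        # one input neuron per pixel
layers = [input_neurons, 1024, 1024, 10]         # assumed architecture
print(input_neurons)
print(dense_parameter_count(layers))
```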
Convolutional Neural Networks
Convolutional Neural Networks (CNN)
LeCun, 1998: handwritten digit recognition, 32x32 pixels
https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/
Source: http://timdettmers.com
Extracting features with convolution
Convolution extracts features automatically.
Kernel parameters are learned during the training process.
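The sliding-window operation can be sketched in plain Python. The kernel below is hand-written to respond to vertical edges; in a CNN its values would be learned during training. No padding and a stride of 1 are assumed.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output

# Hypothetical 4x4 image: dark on the left, bright on the right
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
vertical_edge = [[-1, 1],
                 [-1, 1]]
for row in convolve2d(image, vertical_edge):
    print(row)
```

The output is non-zero only in the middle column, where the dark-to-bright edge sits.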
Downsampling images with pooling
Source: Stanford University
Pooling shrinks images while preserving significant information.
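The slides do not say which kind of pooling is shown. The sketch below assumes max pooling with a 2x2 window, a common choice, on a hypothetical 4x4 image.

```python
def max_pool2d(image, size=2):
    """Keep the largest value of each non-overlapping size x size window."""
    output = []
    for r in range(0, len(image) - size + 1, size):
        row = []
        for c in range(0, len(image[0]) - size + 1, size):
            window = [image[r + i][c + j] for i in range(size) for j in range(size)]
            row.append(max(window))
        output.append(row)
    return output

image = [
    [1, 3, 2, 0],
    [4, 6, 1, 2],
    [7, 2, 9, 4],
    [0, 1, 3, 5],
]
for row in max_pool2d(image):
    print(row)
```

The 4x4 image shrinks to 2x2 while the strongest value of each region survives.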
Classification, detection, segmentation
https://github.com/dmlc/gluon-cv
Based on models published in 2015-2017
[electric_guitar],
with probability 0.671
Gluon
Face Detection
https://github.com/tornadomeet/mxnet-face
Based on models published 2015-2016
https://github.com/deepinsight/insightface
https://arxiv.org/abs/1801.07698
January 2018
Face Recognition
LFW 99.80%+
Megaface 98%+
with a single model
MXNet
Keras Image Inpainting
https://github.com/MathiasGruber/PConv-Keras
https://arxiv.org/abs/1804.07723
April 2018
Real-Time Pose Estimation
https://github.com/dragonfly90/mxnet_Realtime_Multi-Person_Pose_Estimation
November 2016
MXNet
Caffe 2 Real-Time Pose Estimation: DensePose
https://github.com/facebookresearch/DensePose
February 2018
Recurrent Neural Networks
Recurrent Neural Networks (RNN)
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
Image
captioning
Sentiment
analysis
Machine
translation
Video frame
labeling
Recurrent Neural Networks
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Long Short Term Memory Networks (LSTM)
Hochreiter and Schmidhuber, 1997
• An LSTM neuron computes the
output based on the input and a
previous state
• LSTM neurons have « short-term
memory »
• They do a better job than plain RNNs at
predicting longer sequences of data
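To show how an output is computed from the input and a previous state, here is a sketch of one LSTM step with scalar input and state. The gate layout follows the usual textbook formulation; the parameter names and values are hypothetical, and real cells use vectors and weight matrices.

```python
import math

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def lstm_step(x, h_prev, c_prev, p):
    """One time step: returns the new output h and the new cell state c."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate state
    c = f * c_prev + i * g                                   # keep some memory, add some new
    h = o * math.tanh(c)
    return h, c

# Hypothetical parameters, all set to the same value for brevity
names = ["wf", "uf", "bf", "wi", "ui", "bi", "wo", "uo", "bo", "wg", "ug", "bg"]
params = {name: 0.5 for name in names}

h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.3, 0.8]:                  # a short input sequence
    h, c = lstm_step(x, h, c, params)
    print(round(h, 4), round(c, 4))
```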
Machine Translation – AWS Sockeye
https://github.com/awslabs/sockeye
MXNet
OCR – Tesseract 4.0 (beta)
https://github.com/tesseract-ocr/tesseract/wiki/NeuralNetsInTesseract4.00
https://www.learnopencv.com/deep-learning-based-text-recognition-ocr-using-tesseract-and-opencv/
Generative Adversarial Networks
Generative Adversarial Networks
Goodfellow, 2014 https://arxiv.org/abs/1406.2661
https://medium.com/@julsimon/generative-adversarial-networks-on-apache-mxnet-part-1-b6d39e6b5df1
Generator
Building images
from random vectors
Detector (usually called the discriminator)
Learning to distinguish real samples
from generated ones
Gradient updates
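The two networks are trained against each other with opposing losses. The sketch below shows only those losses, assuming the detector outputs the probability that a sample is real; the generator loss is the commonly used non-saturating variant. Networks and gradient updates are left out.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Low when real samples score near 1 and generated samples score near 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake):
    """Low when the detector is fooled, i.e. generated samples score near 1."""
    return -math.log(d_fake)

# Hypothetical detector outputs early and late in training
print(discriminator_loss(d_real=0.9, d_fake=0.1), generator_loss(d_fake=0.1))
print(discriminator_loss(d_real=0.5, d_fake=0.5), generator_loss(d_fake=0.5))
```

When the detector can no longer tell the two apart, it outputs about 0.5 for everything, which is the second case above.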
GAN: Welcome to the (un)real world, Neo
Generating new ”celebrity” faces
https://github.com/tkarras/progressive_growing_of_gans
April 2018
From semantic map to 2048x1024 picture
https://tcwang0509.github.io/pix2pixHD/
November 2017
TF
PyTorch
GAN: Everybody dance now
https://arxiv.org/abs/1808.07371
https://www.youtube.com/watch?v=PCBTZh41Ris
August 2018
Getting started
Resources
http://www.deeplearningbook.org/
https://gluon.mxnet.io
https://keras.io
https://medium.com/@julsimon
https://gitlab.com/juliensimon/{aws,dlnotebooks}
Deep Learning: concepts and use cases (October 2018)
Thank you!
Julien Simon
Principal Technical Evangelist, AI and Machine Learning, AWS
@julsimon
