http://bit.ly/dlai2018
Santiago Pascual de la Puente
santi.pascual@upc.edu
PhD Candidate
Universitat Politecnica de Catalunya
Technical University of Catalonia
Deep Generative Models I
Variational Autoencoders
#DLUPC
Outline
● What is in here?
● Introduction
● Taxonomy
● Variational Auto-Encoders (VAEs) (DGMs I)
● Generative Adversarial Networks (GANs)
● PixelCNN/Wavenet
● Normalizing Flows (and flow-based gen. models)
○ Real NVP
○ GLOW
● Models comparison
● Conclusions
2
What is in here?
Disclaimer
4
● We are going to make a shift in the modeling paradigm:
discriminative → generative.
● Is this a type of neural network? No → fully connected,
convolutional, or recurrent networks can all fit, depending on how we
condition data points (sequentially, globally, etc.).
● This is a very fun topic with which we can make a network paint,
sing or write.
Planning for our generative trip
5
● 5/11/2018 → DGM I: Introduction to gen models + VAEs
○ (... in b/w ...) Methodology + RNN lessons
● 19/11/2018 → DGM II: GANs
● 19/11/2018 → DGM III: likelihood models (pixelCNN + flow models)
● 26/11/2018 → Practical lesson on DGMs (code day)
Introduction
What we are used to doing with Neural Nets
7
0
1
0
input
Network
output
class
Figure credit: Javier Ruiz
Discriminative model → aka. tell me the probability of some ‘Y’ responses given ‘X’ inputs. Here we
don’t care about the process that generated ‘X’; we just detect some patterns in the input to give
an answer. This gives us the probability of some outcomes given the ‘X’ data: P(Y | X).
P(Y = [0,1,0] | X = [pixel_1, pixel_2, …, pixel_784])
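As a minimal sketch of this discriminative setup (sizes and layer choices are illustrative, not the deck's own model): a small network mapping 784 pixels to softmax probabilities over 3 classes.

import torch
import torch.nn as nn

classifier = nn.Sequential(nn.Linear(784, 256), nn.ReLU(),
                           nn.Linear(256, 3))         # logits for the 3 classes

x = torch.rand(1, 784)                                # one flattened input image
probs = torch.softmax(classifier(x), dim=1)           # P(Y | X); the three values sum to 1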
What is a generative model?
8
We have datapoints that emerge from some
generating process (landscapes in nature,
speech from people, etc.)
X = {x_1, x_2, …, x_N}
Having our dataset X with example datapoints
(images, waveforms, written text, etc.), each point x_n
lies in some M-dimensional space.
What is a generative model?
9
We have datapoints that emerge from some
generating process (landscapes in nature,
speech from people, etc.)
X = {x_1, x_2, …, x_N}
Having our dataset X with example datapoints
(images, waveforms, written text, etc.), each point x_n
lies in some M-dimensional space.
Each point x_n comes from an M-dimensional
probability distribution P(X) → Model it!
What is a generative model?
10
We have datapoints that emerge from some generating
process (landscapes in nature, speech from people,
etc.)
X = {x_1, x_2, …, x_N}
Y = {y_1, y_2, …, y_N}
Having our dataset X with example datapoints (images,
waveforms, written text, etc.), each point x_n lies in some
M-dimensional space. Each point y_n lies in some
K-dimensional space. M can be different from K.
(Example conditioning labels: DOG, CAT, TRUCK, PIZZA; THRILLER, SCI-FI, HISTORY; /aa/, /e/, /o/)
We can also model joint probabilities to apply
conditioning variables to the generative
process: P(X, Y)
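As a quick illustrative sketch of such conditioning (hypothetical sizes and a hypothetical cond_decoder; the deck only states the idea P(X, Y)): a label can simply be appended to the latent code so the generator models P(X | z, Y).

import torch
import torch.nn as nn

# Hypothetical conditional decoder: a one-hot label y (K = 10 classes) is
# concatenated with the latent z (20 dims), so generation is conditioned on Y.
cond_decoder = nn.Sequential(nn.Linear(20 + 10, 400), nn.ReLU(),
                             nn.Linear(400, 784), nn.Sigmoid())

z = torch.randn(1, 20)                        # latent sample
y = torch.zeros(1, 10); y[0, 3] = 1.0         # condition: class index 3, one-hot
x = cond_decoder(torch.cat([z, y], dim=1))    # a sample drawn given that condition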
11
1) We want our model with parameters θ = {weights, biases} to output samples
distributed as Pmodel, matching the distribution of our training data Pdata or P(X).
2) We can then sample points from Pmodel that plausibly look Pdata-distributed.
What is a generative model?
12
1) We want our model with parameters θ = {weights, biases} to output samples
distributed as Pmodel, matching the distribution of our training data Pdata or
P(X).
2) We can then sample points from Pmodel that plausibly look Pdata-distributed.
What is a generative model?
We have not mentioned any network structure
or model input data format, just a requirement
on what we want in the output of our model.
13
We can generate any type of data, like speech waveform samples or image pixels.
What is a generative model?
(Figure: an audio clip of M samples has M dimensions; a 32×32 RGB image is scanned and unrolled across its 3 channels into M = 32×32×3 = 3072 dimensions.)
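For instance (a minimal sketch; the tensor values are random stand-ins for a real image):

import torch

img = torch.rand(3, 32, 32)   # a 32x32 RGB image (3 channels)
x = img.reshape(-1)           # scan and unroll into a single vector
print(x.shape)                # torch.Size([3072]) -> M = 32*32*3 dimensions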
Our learned model should be able to make up new samples from the
distribution, not just copy and paste existing samples!
14
What is a generative model?
Figure from NIPS 2016 Tutorial: Generative Adversarial Networks (I. Goodfellow)
Why Generative Models?
● Model very complex and high-dimensional distributions.
● Be able to generate realistic synthetic samples
○ possibly perform data augmentation
○ simulate possible futures for learning algorithms
● Unsupervised learning: fill blanks in the data
● Manipulate real samples with the assistance of the generative model
○ Example: edit pictures with guidance (photoshop super pro level)
Motivating Applications
Image inpainting
17
Recover lost information/add enhancing details by learning the natural distribution of pixels.
(Figure: original vs. enhanced)
Speech Enhancement
18
Recover lost information/add enhancing details by learning the natural distribution of audio samples.
(Audio examples: original vs. enhanced)
Speech Synthesis
19
Spontaneously generate new speech by learning its natural distribution over time.
Speech
synthesizer
Image Generation
20
Spontaneously generate new images by learning their spatial distribution.
Image
generator
Figure credit: I. Goodfellow
Super-resolution
21
Generate a high-resolution version of an image, even introducing made-up but plausible details.
(Ledig et al. 2016)
Generative Models Taxonomy
23
We will see models that learn the probability density function:
● Explicitly (we impose a known loss function)
○ (1) With approximate density → Variational
Auto-Encoders (VAEs)
○ (2) With tractable density & likelihood-based:
■ PixelCNN, Wavenet.
■ Flow-based models: real-NVP, GLOW.
● Implicitly (we “do not know” the loss function)
○ (3) Generative Adversarial Networks (GANs).
Taxonomy
Variational Auto-Encoders
Auto-Encoder Neural Network
Autoencoders:
● Predict at the output the
same input data.
● Do not need labels
(Figure: x → Encode → bottleneck code c → Decode → x̂)
25
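A minimal autoencoder sketch in PyTorch (illustrative sizes, assuming flattened 784-dimensional inputs): the encoder squeezes x into a bottleneck code c and the decoder reconstructs x̂ from it, with no labels needed.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AutoEncoder(nn.Module):
    def __init__(self, in_dim=784, code_dim=32):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                    nn.Linear(256, code_dim))             # bottleneck c
        self.decode = nn.Sequential(nn.Linear(code_dim, 256), nn.ReLU(),
                                    nn.Linear(256, in_dim), nn.Sigmoid())

    def forward(self, x):
        return self.decode(self.encode(x))   # x_hat, trained to match x itself

x = torch.rand(16, 784)
x_hat = AutoEncoder()(x)
loss = F.mse_loss(x_hat, x)                  # plain reconstruction loss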
Auto-Encoder Neural Network
● Q: Is an AE a generative model? (How can we make it generate data?)
c
Encode Decode
“Generate” 26
Auto-Encoder Neural Network
● Q: Is an AE a generative model? (How can we make it generate data?)
○ A: This “just” memorizes codes C from our training samples!
c
Encode Decode
“Generate”
memorization
27
Variational Auto-Encoder
VAE intuitively:
● Introduce a restriction on z, such that our data points x (e.g. images)
are distributed in a latent space (manifold) following a specified
probability density function p(z) (typically N(0, I)).
z
Encode Decode
z ~ N(0, I)
28
VAE intuitively:
● Introduce a restriction on z, such that our data points x (e.g. images)
are distributed in a latent space (manifold) following a specified
probability density function p(z) (typically N(0, I)).
z
Encode Decode
z ~ N(0, I)
We can then sample z to
generate NEW x data
points (e.g. images).
Variational Auto-Encoder
29
Variational Auto-Encoder
● VAE, aka. where Bayesian theory and deep learning collide, in depth:
(Figure: graphical model: fixed parameters θ; sample z, N times; a deterministic function generates X, N times.)
30
Variational Auto-Encoder
● VAE, aka. where Bayesian theory and deep learning collide, in depth:
(Figure: graphical model: fixed parameters θ; sample z, N times; a deterministic function generates X, N times.)
Maximize the probability of
each X under the generative
process.
Maximum likelihood framework
Variational Auto-Encoder
Intuition behind normally distributed z vectors: any output distribution can be achieved from the simple
N(0, I) with powerful mappings.
N(0, I)
32
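A quick numerical illustration of this intuition (this particular mapping is the ring example from the Doersch 2016 tutorial cited in the references; it is only a sketch):

import torch

z = torch.randn(5000, 2)                       # z ~ N(0, I) in 2-D
x = z / 10 + z / z.norm(dim=1, keepdim=True)   # a simple deterministic mapping g(z)
# x now concentrates on a ring of radius ~1: a clearly non-Gaussian distribution
# obtained purely by transforming Gaussian noise. A deep network can learn far
# richer mappings of the same kind.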
Variational Auto-Encoder
Intuition behind normally distributed z vectors: any output distribution can be achieved from the simple
N(0, I) with powerful mappings.
Who’s the strongest non-linear and learnable mapper in the universe (so far)?
33
Variational Auto-Encoder
● VAE, aka. where Bayesian theory and deep learning collide, in depth:
(Figure: graphical model: fixed parameters θ; sample z, N times; a deterministic function generates X, N times.)
Maximize the probability of
each X under the generative
process.
Maximum likelihood framework
Variational Auto-Encoder
Now to solve the maximum likelihood problem… We’d like to know P(X) and P(z|X). We
introduce Q(z|X) as a key piece → sample values z likely to produce X, not just all the
possibilities.
But P(z|X) is unknown too! Variational Inference comes in to play its role: approximate
P(z|X) with Q(z|X).
Key idea behind the variational inference application: find an approximation
function that is good enough to represent the real one → optimization problem.
35
Variational Auto-Encoder
Neural network perspective
The approximated function starts to shape up as a neural encoder, going from training datapoints x to
the likely z points following Q(z|X), which in turn approximates the real P(z|X).
36
Credit: Altosaar
What is a variational autoencoder? (Altosaar 2017)
Variational Auto-Encoder
Neural network perspective
The (latent → data) mapping starts to shape up as a neural decoder, where we go from our sampled z
to the reconstruction, which can have a very complex distribution.
37
Credit: Altosaar
Variational Auto-Encoder
Continuing with the encoder approximation Q(z|X), we compute the KL divergence with the true
posterior distribution P(z|X):
38
KL divergence
Credit: Kristiadi
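For reference, the quantity being minimized in this step is (standard definition, written out here since the slide formula is an image):

KL[ Q(z|X) ‖ P(z|X) ] = E_{z~Q(z|X)}[ log Q(z|X) − log P(z|X) ]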
Variational Auto-Encoder
Continuing with the encoder approximation Q(z|X), we compute the KL divergence with the true
posterior distribution P(z|X):
Bayes rule
39
Variational Autoencoder: Intuition and Implementation (Kristiadi 2017)
Variational Auto-Encoder
Continuing with the encoder approximation Q(z|X), we compute the KL divergence with the true
posterior distribution P(z|X):
log P(X) gets out of the expectation because it has no
dependency on z.
40
Credit: Kristiadi
Variational Auto-Encoder
Continuing with the encoder approximation Q(z|X), we compute the KL divergence with the true
posterior distribution P(z|X):
41
Credit: Kristiadi
Variational Auto-Encoder
Continuing with the encoder approximation Q(z|X), we compute the KL divergence with the true
posterior distribution P(z|X):
A bit more rearranging with signs and grouping leads us to a new KL term between Q(z|X)
and P(z), i.e. between the encoder’s approximate distribution and our prior. 42
Credit: Kristiadi
Variational Auto-Encoder
We finally reach the Variational AutoEncoder objective function.
43
Credit: Kristiadi
Variational Auto-Encoder
We finally reach the Variational AutoEncoder objective function.
log likelihood of our data
44
Credit: Kristiadi
Variational Auto-Encoder
We finally reach the Variational AutoEncoder objective function.
Not computable and non-negative (KL)
approximation error.
log likelihood of our data
45
Credit: Kristiadi
Variational Auto-Encoder
We finally reach the Variational AutoEncoder objective function.
log likelihood of our data
Not computable and non-negative (KL)
approximation error.
46
Reconstruction loss of our data
given latent space → NEURAL
DECODER reconstruction loss!
Credit: Kristiadi
Variational Auto-Encoder
We finally reach the Variational AutoEncoder objective function.
log likelihood of our data
Not computable and non-negative (KL)
approximation error.
Reconstruction loss of our data
given latent space → NEURAL
DECODER reconstruction loss!
Regularization of our latent
representation → NEURAL
ENCODER projects over prior.
47
Credit: Kristiadi
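Written out (the evidence lower bound from Kingma & Welling 2013, following the cited Kristiadi derivation), the identity the slides arrive at is:

log P(X) − KL[ Q(z|X) ‖ P(z|X) ] = E_{z~Q(z|X)}[ log P(X|z) ] − KL[ Q(z|X) ‖ P(z) ]

Maximizing the right-hand side (reconstruction term minus encoder regularization) maximizes a lower bound on log P(X), since the dropped KL term on the left is non-negative.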
Variational Auto-Encoder
Now, we have to define the shape of Q(z|X) to compute its divergence against the prior (i.e. to properly
condense x samples over the surface of z). Simplest way: use a normal distribution with
predicted moments μ(X) and Σ(X).
This allows us to compute the KL divergence with N(0, I) in closed form!
48
Credit: Kristiadi
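With Q(z|X) = N(μ(X), diag(σ²(X))) and prior N(0, I), that closed form can be sketched in PyTorch as follows (mu and logvar are assumed to be the encoder outputs):

import torch

def kl_to_standard_normal(mu, logvar):
    # KL[ N(mu, sigma^2) || N(0, I) ], summed over the latent dimensions:
    # 0.5 * sum( sigma^2 + mu^2 - 1 - log(sigma^2) )
    return 0.5 * torch.sum(logvar.exp() + mu.pow(2) - 1.0 - logvar, dim=1)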
Variational Auto-Encoder
z
Encode Decode
We can compose our encoder - decoder setup, and place our VAE losses to regularize and reconstruct.
49
Variational Auto-Encoder
z
Encode Decode
Reparameterization trick
But WAIT, how can we backprop through the sampling of z ~ Q(z|X)? Sampling is not differentiable!
50
Variational Auto-Encoder
z
Reparameterization trick
Sample ε ~ N(0, I) and operate with it, multiplying by σ and summing μ: z = μ + σ · ε
51
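A minimal PyTorch sketch of the trick (mu and logvar are assumed to be the encoder outputs): the randomness is moved into an auxiliary noise variable ε, so gradients flow through μ and σ.

import torch

def reparameterize(mu, logvar):
    std = torch.exp(0.5 * logvar)   # sigma
    eps = torch.randn_like(std)     # eps ~ N(0, I); no gradient needs to flow into it
    return mu + eps * std           # z = mu + sigma * eps, differentiable w.r.t. mu and sigma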
Variational Auto-Encoder
z
Generative behavior
Q: How can we now generate new samples once the underlying generating distribution is learned?
52
Variational Auto-Encoder
z1
Generative behavior
Q: How can we now generate new samples once the underlying generating distribution is learned?
A: We can sample from our prior, for example, discarding the encoder path.
z2
z3
53
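A sketch of this generative mode (the decoder here is an untrained stand-in with illustrative sizes; in practice it would be the trained VAE decoder):

import torch
import torch.nn as nn

decoder = nn.Sequential(nn.Linear(20, 400), nn.ReLU(),       # stand-in for the
                        nn.Linear(400, 784), nn.Sigmoid())   # trained VAE decoder

with torch.no_grad():
    z = torch.randn(64, 20)                    # sample z from the prior N(0, I)
    samples = decoder(z).view(64, 1, 28, 28)   # brand-new images; no encoder involved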
54
Variational Auto-Encoder
Walking around the z manifold dimensions gives us spontaneous generation of samples with different
shapes, poses, identities, lighting, etc.
Examples:
MNIST manifold: https://youtu.be/hgyB8RegAlQ
Face manifold: https://www.youtube.com/watch?v=XNZIN7Jh3Sg
55
Variational Auto-Encoder
Walking around the z manifold dimensions gives us spontaneous generation of samples with different
shapes, poses, identities, lighting, etc.
Example with MNIST manifold
56
Variational Auto-Encoder
Walking around the z manifold dimensions gives us spontaneous generation of samples with different
shapes, poses, identities, lighting, etc.
Example with Faces manifold
Variational Auto-Encoder
Code show with PyTorch on VAEs!
https://github.com/pytorch/examples/tree/master/vae
57
Variational Auto-Encoder
58
Loss
Model MNIST: binary reconstruction case (BCE loss)
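That slide's loss can be sketched as in the linked PyTorch example (a sketch; reduction and averaging conventions in the official script may differ): binary cross-entropy reconstruction over MNIST pixels plus the closed-form KL term.

import torch
import torch.nn.functional as F

def vae_loss(x_hat, x, mu, logvar):
    # Reconstruction term: BCE over 784 Bernoulli pixel probabilities.
    bce = F.binary_cross_entropy(x_hat, x.view(-1, 784), reduction='sum')
    # Regularization term: KL[ N(mu, sigma^2) || N(0, I) ] in closed form.
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return bce + kld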
GANs are coming in DGMs II ...
The GAN epidemic
Thanks! Questions?
References
61
● NIPS 2016 Tutorial: Generative Adversarial Networks (Goodfellow 2016)
● Auto-Encoding Variational Bayes (Kingma & Welling 2013)
● Variational Autoencoder: Intuition and Implementation (Kristiadi): https://wiseodd.github.io/techblog/2016/12/10/variational-autoencoder/
● What is a Variational Autoencoder? (Altosaar): https://jaan.io/what-is-variational-autoencoder-vae-tutorial/
● Tutorial on Variational Autoencoders (Doersch 2016)