A Simple SGVB
(Stochastic Gradient Variational Bayes)
for the CTM
(Correlated Topic Model)
Tomonari MASADA (正田備也)
Nagasaki University (長崎大学)
masada@nagasaki-u.ac.jp
APWeb 2016 @ Suzhou
Aim
•Make an informative summary of
large document sets by
•extracting word lists, each relating to
a distinct topic.
→ Topic modeling
2
A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model
Contribution
•We propose a new posterior estimation for
the correlated topic model (CTM) [Blei+ 07],
•an extension of LDA [Blei+ 03] for modeling
topic correlations,
•with stochastic gradient variational Bayes
(SGVB) [Kingma+ 14].
4
LDA [Blei+ 03]
•Clustering word tokens by assigning each word token to
one among the 𝐾 topics.
• 𝑧 𝑑𝑖: To which topic is the 𝑖-th word token in document 𝑑
assigned?
• 𝜃 𝑑𝑘: How often is the topic 𝑘 talked about in document 𝑑?
• Multinomial distribution for each 𝑑
• 𝜙 𝑘𝑣: How often is the word 𝑣 used to talk about the topic 𝑘?
• Multinomial distribution for each 𝑘
discrete variables
continuous variables
5
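The bullets above can be sketched as the LDA generative process for a single document. This is a minimal illustration; the sizes and Dirichlet hyperparameters are made-up values, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (not from the paper): K topics, V word types, doc length N_d.
K, V, N_d = 3, 1000, 50
alpha, beta = 0.1, 0.01  # assumed symmetric Dirichlet hyperparameters

# phi[k, v]: how often word v is used to talk about topic k (one multinomial per topic).
phi = rng.dirichlet(np.full(V, beta), size=K)
# theta_d[k]: how often topic k is talked about in this document (one multinomial per doc).
theta_d = rng.dirichlet(np.full(K, alpha))

# For each word token i: draw its topic assignment z_di, then draw the word
# from that topic's word distribution.
z_d = rng.choice(K, size=N_d, p=theta_d)
w_d = np.array([rng.choice(V, p=phi[z]) for z in z_d])
```

Here z_d holds the discrete variables and (theta_d, phi) the continuous ones, matching the split above.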
CTM [Blei+ 05]
•Clustering word tokens by assigning each word token to
one among the 𝐾 topics.
• 𝑧 𝑑𝑖: To which topic is the 𝑖-th word token in document 𝑑
assigned?
• 𝜃 𝑑𝑘: How often is the topic 𝑘 talked about in document 𝑑?
• 𝜽 𝑑 = 𝑓(𝜼 𝑑) where 𝜼 𝑑 ~ 𝑁(𝝁, 𝚺) (logistic normal distribution)
• 𝜙 𝑘𝑣: How often is the word 𝑣 used to talk about the topic 𝑘?
• Multinomial distribution for each 𝑘
discrete variables
continuous variables
6
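The CTM's logistic normal draw of 𝜽 𝑑 can be sketched as follows. The prior parameters (mu, Sigma) below are illustrative placeholders; the point is that, unlike a Dirichlet, the off-diagonal entries of Sigma let topic proportions covary, which is how the CTM models topic correlations:

```python
import numpy as np

rng = np.random.default_rng(0)
K = 3  # illustrative number of topics

# Illustrative prior parameters (not from the paper).
mu = np.zeros(K)
A = rng.normal(size=(K, K))
Sigma = A @ A.T + 1e-6 * np.eye(K)  # a valid (positive definite) covariance matrix

# eta_d ~ N(mu, Sigma): correlations enter through Sigma's off-diagonal entries.
eta_d = rng.multivariate_normal(mu, Sigma)
# theta_d = f(eta_d) = softmax(eta_d): maps eta_d onto the probability simplex.
theta_d = np.exp(eta_d - eta_d.max())
theta_d /= theta_d.sum()
```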
Variational Bayes
Maximization of ELBO (evidence lower bound)
•VB (variational Bayes) approximates the true posterior.
•An approximate posterior is introduced when ELBO is
obtained by Jensen's inequality:
• 𝒛: discrete hidden variables (topic assignments)
• 𝚯: continuous hidden variables (multinomial parameters)
7
log 𝑝(𝒙) ≥ 𝔼 𝑞(𝒛,𝚯)[log 𝑝(𝒙, 𝒛, 𝚯) − log 𝑞(𝒛, 𝚯)] = ELBO,
where log 𝑝(𝒙) is the log evidence and 𝑞(𝒛, 𝚯) is the approximate posterior.
Factorization assumption
•We assume the approximate posterior 𝑞(𝒛, 𝚯)
factorizes as 𝑞(𝒛)𝑞(𝚯).
•Then ELBO splits into a discrete part involving 𝑞(𝒛)
and a continuous part involving 𝑞(𝚯):
ELBO = 𝔼 𝑞(𝒛)𝑞(𝚯)[log 𝑝(𝒙, 𝒛, 𝚯)] − 𝔼 𝑞(𝒛)[log 𝑞(𝒛)] − 𝔼 𝑞(𝚯)[log 𝑞(𝚯)]
8
SGVB
[Kingma+ 14]
•SGVB (stochastic gradient variational Bayes) is a
general framework for estimating ELBO in
VB.
•SGVB is only applicable to continuous
distributions 𝑞(𝚯).
•Monte Carlo integration for expectation
9
Reparameterization
•We use the diagonal logistic normal for
approximating the true posterior of 𝜽 𝑑.
•We can efficiently sample from the logistic
normal with reparameterization.
10
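The reparameterization of the diagonal logistic normal can be sketched as below. The variational parameter values are illustrative placeholders; what matters is that the sampled noise is parameter-free, so the transform from (m, log_s) to theta is differentiable and gradients can pass through the sample:

```python
import numpy as np

rng = np.random.default_rng(0)
K = 3  # illustrative number of topics

# Variational parameters of the diagonal logistic normal (illustrative values).
m = np.zeros(K)       # per-document mean
log_s = np.zeros(K)   # per-document log standard deviations

# Reparameterization trick: sample parameter-free noise, then apply a
# deterministic, differentiable transform of the variational parameters.
eps = rng.standard_normal(K)       # eps ~ N(0, I)
eta = m + np.exp(log_s) * eps      # eta ~ N(m, diag(s^2))
theta = np.exp(eta - eta.max())    # theta = softmax(eta), numerically stable
theta /= theta.sum()
```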
Monte Carlo integration
•ELBO is estimated with a sample from the
approximate posterior.
• The discrete part 𝑞(𝒛) is estimated as in the original VB.
11
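The Monte Carlo estimate of the continuous expectation can be sketched as follows. The integrand below is a hypothetical stand-in for the document likelihood term of ELBO, and the parameter values are illustrative; the pattern of drawing S reparameterized samples and averaging is the point:

```python
import numpy as np

rng = np.random.default_rng(0)
K, S = 3, 1  # S samples per estimate; a single sample often suffices in SGVB

# Illustrative variational parameters of the diagonal logistic normal.
m, s = np.zeros(K), np.ones(K)

def term(theta):
    # Hypothetical stand-in for the ELBO term that depends on theta,
    # e.g. E_q[log p(words | theta, phi)] for one document.
    return np.log(theta).mean()

# Draw S reparameterized samples of theta from the approximate posterior.
eps = rng.standard_normal((S, K))
eta = m + s * eps
theta = np.exp(eta - eta.max(axis=1, keepdims=True))
theta /= theta.sum(axis=1, keepdims=True)

# Monte Carlo integration: average the integrand over the samples.
term_hat = np.mean([term(t) for t in theta])
```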
Parameter updates
No explicit inversion (only Cholesky factorization)
12
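The "no explicit inversion" point can be illustrated generically: any solve against the covariance matrix 𝚺 that the updates need can go through a Cholesky factorization instead of forming 𝚺⁻¹. The matrix below is an illustrative example, not the paper's update equations:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)
K = 3

A = rng.normal(size=(K, K))
Sigma = A @ A.T + K * np.eye(K)  # an illustrative positive definite covariance
b = rng.normal(size=K)

# Solve Sigma @ x = b via the Cholesky factor. This avoids computing
# np.linalg.inv(Sigma) explicitly and is cheaper and numerically more stable.
c, low = cho_factor(Sigma)
x = cho_solve((c, low), b)
```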
"Stochastic" gradient
•The expectation integrals are estimated
by the Monte Carlo method.
•The derivatives of ELBO depend on the samples.
•Randomness is thus incorporated into the
maximization of ELBO.
•Does this make it easier to escape poor local optima?
13
Data sets
        # docs    # word types
NYT     149,890   46,650
MOVIE   27,859    62,408
NSF     128,818   21,471
MED     125,490   42,830
14
Conclusion
•We incorporate randomness into the
posterior inference for the CTM by
using SGVB.
•The proposed method gives perplexities
comparable to those achieved by LDA.
19
Pro/Con
•No explicit inversion of the covariance
matrix is required.
•Careful tuning of gradient descent
seems required.
•Only Adam was tested.
20
Future work
•Online learning for topic models with NN
•NN may achieve a better approximate posterior.
•SGVB can be used to estimate ELBO in a similar
manner.
•Document batches can be fed to VB indefinitely.
•Topic word lists are then updated indefinitely.
21
