[Kim+ ICML2012] Dirichlet Process with Mixed Random Measures : A Nonparametric Topic Model for Labeled Data

2012/07/28
Nakatani Shuyo @ Cybozu Labs, Inc
twitter : @shuyo

LDA (Latent Dirichlet Allocation) [Blei+ 03]
• Unsupervised Topic Model
  – Each word has an unobserved topic
• Parametric
  – The number of topics K is given in advance

(Figure via Wikipedia)

Labeled LDA [Ramage+ 09]

• Supervised Topic Model
  – Each document has an observed label
• Parametric

(Figure via [Ramage+ 09])

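Generative Process for L-LDA
• $\boldsymbol{\beta}_k \sim \mathrm{Dir}(\boldsymbol{\eta})$ (topics corresponding to observed labels)
• $\Lambda_k^{(d)} \sim \mathrm{Bernoulli}(\Phi_k)$
• $\boldsymbol{\theta}^{(d)} \sim \mathrm{Dir}(\boldsymbol{\alpha}^{(d)})$
  – where $\boldsymbol{\alpha}^{(d)} = \{\alpha_k\}_{k : \Lambda_k^{(d)} = 1}$ (restricted to labeled parameters)
• $z_i^{(d)} \sim \mathrm{Multi}(\boldsymbol{\theta}^{(d)})$
• $w_i^{(d)} \sim \mathrm{Multi}(\boldsymbol{\beta}_{z_i^{(d)}})$

(via [Ramage+ 09])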
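
The generative process above can be sketched in a few lines of Python. This is only a minimal illustration, not the implementation from [Ramage+ 09]: the function names `dirichlet` and `generate_llda_document`, the symmetric scalar `alpha`, and the toy sizes are all assumptions made for the example, and the labels are passed in as observed rather than drawn from a Bernoulli.

```python
import random

def dirichlet(alphas, rng):
    """Draw from Dir(alphas) by normalizing independent Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [x / total for x in draws]

def generate_llda_document(labels, n_topics, beta, alpha, n_words, rng):
    """Generate one document whose topics are restricted to its observed labels.

    labels: topic indices k with Lambda_k = 1 for this document (observed).
    beta:   per-topic word distributions, beta[k][v].
    alpha:  symmetric Dirichlet parameter (an assumption of this sketch).
    """
    active = sorted(labels)
    # theta ~ Dir(alpha restricted to the labeled topics)
    theta_active = dirichlet([alpha] * len(active), rng)
    theta = [0.0] * n_topics
    for k, p in zip(active, theta_active):
        theta[k] = p
    vocab = range(len(beta[0]))
    topics, words = [], []
    for _ in range(n_words):
        z = rng.choices(active, weights=theta_active)[0]  # z_i ~ Multi(theta)
        w = rng.choices(vocab, weights=beta[z])[0]        # w_i ~ Multi(beta_z)
        topics.append(z)
        words.append(w)
    return theta, topics, words

rng = random.Random(0)
n_topics, vocab_size = 4, 6
# beta_k ~ Dir(eta) with a symmetric eta = 1.0 (toy value)
beta = [dirichlet([1.0] * vocab_size, rng) for _ in range(n_topics)]
theta, topics, words = generate_llda_document(
    {0, 2}, n_topics, beta, alpha=0.5, n_words=20, rng=rng)
```

Because `theta` is zero outside the label set, every sampled topic index belongs to the document's labels, which is the restriction the slide describes.
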
Pros/Cons of L-LDA
• Pros
  – Easy to implement

(Figure via [Ramage+ 09])

• Cons
  – The label-topic correspondence must be
    specified manually
     • Its performance depends on that correspondence

         ※) My implementation is here : https://github.com/shuyo/iir/blob/master/lda/llda.py
DP-MRM [Kim+ 12]
  – Dirichlet Process with Mixed Random Measures

• Supervised Topic Model
• Nonparametric
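  – K is the number of labels, not the number of topics

(Figure: graphical model of DP-MRM with nodes $H$, $G_0^k$, $G_j$, $\theta_{ji}$, $x_{ji}$, $\lambda_j$, $r_j$, hyperparameters $\alpha$, $\beta$, $\gamma_k$, $\eta$, and plates $K$, $N_j$, $D$)
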
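Generative Process for DP-MRM
• $H = \mathrm{Dir}(\beta)$
• $G_0^k \sim \mathrm{DP}(\gamma_k, H)$ (each label has a random measure as its topic space)
• $\lambda_j \sim \mathrm{Dir}(\boldsymbol{r}_j \eta)$ where $\boldsymbol{r}_j = \left(I[k \in \mathrm{label}(j)]\right)_k$
• $G_j \sim \mathrm{DP}\left(\alpha, \sum_{k \in \mathrm{label}(j)} \lambda_{jk} G_0^k\right)$ (mixed random measures)
• $\theta_{ji} \sim G_j$, $x_{ji} \sim F(\theta_{ji}) = \mathrm{Multi}(\theta_{ji})$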
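
The label-restricted mixing weights $\lambda_j$ are the piece that ties a document to its own labels. A minimal sketch follows, assuming a hypothetical helper `sample_label_weights` and treating a Dirichlet parameter of 0 as a point mass at 0 for that coordinate; it illustrates the definition above and is not code from [Kim+ 12].

```python
import random

def sample_label_weights(doc_labels, n_labels, eta, rng):
    """lambda_j ~ Dir(r_j * eta), with r_jk = 1 if label k is on document j, else 0.

    Coordinates with r_jk = 0 are fixed at 0, so only the document's own
    labels receive weight in the mixed base measure sum_k lambda_jk * G_0^k.
    """
    active = sorted(set(doc_labels))
    # Dirichlet draw over the active labels via normalized Gamma(eta, 1) draws
    draws = [rng.gammavariate(eta, 1.0) for _ in active]
    total = sum(draws)
    lam = [0.0] * n_labels
    for k, g in zip(active, draws):
        lam[k] = g / total
    return lam

rng = random.Random(0)
# Toy document carrying labels 1 and 3 out of K = 5 labels
lam = sample_label_weights(doc_labels=[1, 3], n_labels=5, eta=1.0, rng=rng)
```

Here `lam` has positive mass only at indices 1 and 3, so a draw from the mixed base measure can only reach the topic spaces $G_0^1$ and $G_0^3$.
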
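Stick Breaking Process
• $v_l^k \sim \mathrm{Beta}(1, \gamma_k)$, $\pi_l^k = v_l^k \prod_{d=0}^{l-1} (1 - v_d^k)$
• $\phi_l^k \sim H$, $G_0^k = \sum_{l=0}^{\infty} \pi_l^k \delta_{\phi_l^k}$
• $\lambda_j \sim \mathrm{Dir}(\boldsymbol{r}_j \eta)$, $w_{jt} \sim \mathrm{Beta}(1, \alpha)$, $\pi_{jt} = w_{jt} \prod_{d=0}^{t-1} (1 - w_{jd})$
• $k_{jt} \sim \mathrm{Multi}(\lambda_j)$, $\psi_{jt} \sim G_0^{k_{jt}}$, $G_j = \sum_{t=0}^{\infty} \pi_{jt} \delta_{\psi_{jt}}$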
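
The stick-breaking weights can be simulated directly. The sketch below truncates the infinite sum at a fixed number of sticks, which is an approximation introduced here for illustration; the function name `stick_breaking` and the parameter values are assumptions, not part of the slides.

```python
import random

def stick_breaking(concentration, truncation, rng):
    """Return the first `truncation` weights pi_l = v_l * prod_{d<l} (1 - v_d).

    Also returns the stick mass left over beyond the truncation point.
    """
    weights, remaining = [], 1.0
    for _ in range(truncation):
        v = rng.betavariate(1.0, concentration)  # v_l ~ Beta(1, gamma)
        weights.append(v * remaining)            # break off a piece of the stick
        remaining *= 1.0 - v                     # what is left for later pieces
    return weights, remaining

rng = random.Random(0)
weights, leftover = stick_breaking(concentration=2.0, truncation=10, rng=rng)
```

The same function covers both levels of the model: call it with $\gamma_k$ for the label-level weights $\pi_l^k$ and with $\alpha$ for the document-level weights $\pi_{jt}$.
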
Chinese Restaurant Franchise
• $t_{ji}$ : table index of the $i$-th term in the $j$-th document
• $k_{jt}, l_{jt}$ : dish indexes on the $t$-th table of the $j$-th document
  – In a normal HDP, this layer consists of only a single DP $G_0$
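
To make the table indexes concrete, here is a minimal sketch of the prior seating process for one document: a customer joins an existing table with probability proportional to its occupancy and opens a new table with probability proportional to $\alpha$, and each new table draws its label index $k_{jt}$ from $\lambda_j$. This simulates the prior only, not the posterior sampler of the inference slides, and the function name `seat_customers` and the toy values are assumptions.

```python
import random

def seat_customers(n_customers, alpha, lam, rng):
    """Simulate prior table assignments t_ji for one document.

    Existing table t is chosen with probability proportional to its
    occupancy n_jt, a new table with probability proportional to alpha.
    Each new table also draws its label index k_jt ~ Multi(lam).
    """
    counts, table_labels, assignments = [], [], []
    for _ in range(n_customers):
        # Last slot stands for "open a new table"
        t = rng.choices(range(len(counts) + 1), weights=counts + [alpha])[0]
        if t == len(counts):
            counts.append(0)
            table_labels.append(rng.choices(range(len(lam)), weights=lam)[0])
        counts[t] += 1
        assignments.append(t)
    return assignments, counts, table_labels

rng = random.Random(0)
lam = [0.0, 0.4, 0.0, 0.6, 0.0]  # toy lambda_j for a document with labels 1 and 3
assignments, counts, table_labels = seat_customers(30, alpha=1.0, lam=lam, rng=rng)
```

The dish index $l_{jt}$ within the chosen label would be drawn by a second seating process of the same kind, one per label, which is where DP-MRM differs from the single shared process of a normal HDP.
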
Inference (1)



• Sampling 𝑡
Inference (2)
• Sampling 𝑘 and 𝑙
Experiments
• DP-MRM automatically gives a probabilistic
  label-topic correspondence.

(Figure via [Kim+ 12])

(Figure via [Kim+ 12])

• L-LDA can also make predictions for single-labeled documents
  if a common second label is assigned to every document.
References
• [Kim+ ICML2012] Dirichlet Process with Mixed
  Random Measures : A Nonparametric Topic
  Model for Labeled Data
• [Ramage+ EMNLP2009] Labeled LDA : A
  supervised topic model for credit attribution in
  multi-labeled corpora
• [Blei+ 2003] Latent Dirichlet Allocation

