Understanding the Origin of Bias in Word Embeddings
•
• R&D Engineer @ JTC
• NLP, ML
2
• Understanding the Origins of Bias in Word Embeddings
• Marc-Etienne Brunet, Colleen Alkalay-Houlihan, Ashton Anderson, Richard Zemel
• University of Toronto, Vector Institute for Artificial Intelligence
• ICML 2019
•
•
•
•
•
3
•
•
•
•
•
•
4
$\vec{w}_{\text{man}} - \vec{w}_{\text{woman}} \simeq \vec{w}_{\text{computer programmer}} - \vec{w}_{\text{homemaker}}$
• GloVe
• GloVe
•
5
GloVe [Pennington+ (2014)]
•
• PPMI X
• X 1 :1Xi
6
• $V \times V$ co-occurrence matrix $X$; $w_i, u_j \in \mathbb{R}^D$; hyperparameters $\alpha$, $x_{\max}$
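The symbols on this slide match the standard GloVe objective from Pennington et al. (2014), $J = \sum_{i,j} f(X_{ij})\,(w_i^\top u_j + b_i + c_j - \log X_{ij})^2$ with $f(x) = (x/x_{\max})^{\alpha}$ for $x < x_{\max}$ and $1$ otherwise. The following is a minimal sketch of that loss, not the reference implementation. The function names are hypothetical, and the defaults $x_{\max} = 100$, $\alpha = 0.75$ are the values reported in the GloVe paper rather than anything stated on the slide.

```python
import numpy as np

def glove_weight(x, x_max=100.0, alpha=0.75):
    """Weighting f(x) = (x / x_max)^alpha for x < x_max, else 1."""
    x = np.asarray(x, dtype=float)
    return np.where(x < x_max, (x / x_max) ** alpha, 1.0)

def glove_loss(X, w, u, b, c, x_max=100.0, alpha=0.75):
    """Weighted least-squares GloVe objective.

    X: (V, V) co-occurrence counts; w, u: (V, D) word/context vectors;
    b, c: (V,) word/context biases. Zero counts contribute nothing.
    """
    mask = X > 0
    log_x = np.zeros_like(X, dtype=float)
    log_x[mask] = np.log(X[mask])          # avoid log(0) on empty cells
    resid = w @ u.T + b[:, None] + c[None, :] - log_x
    return float(np.sum(glove_weight(X, x_max, alpha) * mask * resid ** 2))
```

With all parameters at zero, each cell contributes $f(X_{ij}) (\log X_{ij})^2$, which makes the function easy to sanity-check by hand.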
7
•
• Koh & Liang(2017)
•
• k
• ( )
• 0 R, L
• 1
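$\theta^*$
$z_k$: $z_k \to \tilde{z}_k$
$\tilde{\theta}$, $\epsilon \ll 1$
$\theta$
$\theta^*$, $\epsilon$
$z_i$
$\tilde{\theta}$, $\theta^*$, $\nabla_\theta L$, $\nabla^2_\theta L$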
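For reference, the perturbation form of the influence-function approximation in Koh & Liang (2017) is $\tilde{\theta} \approx \theta^* - \frac{1}{n} H_{\theta^*}^{-1} \left[ \nabla_\theta L(\tilde{z}_k, \theta^*) - \nabla_\theta L(z_k, \theta^*) \right]$, where $H_{\theta^*} = \frac{1}{n} \sum_i \nabla^2_\theta L(z_i, \theta^*)$. The sketch below checks it on ridge regression, where retraining is cheap enough to compare against. The model, the data, and every name in the code are assumptions chosen for illustration; none of them come from the slides.

```python
import numpy as np

def fit(X, y, lam):
    """Minimise (1/n) * sum_i [0.5 (x_i.theta - y_i)^2 + 0.5 lam |theta|^2]."""
    n, d = X.shape
    H = X.T @ X / n + lam * np.eye(d)      # Hessian of the empirical risk
    return np.linalg.solve(H, X.T @ y / n), H

def grad_point(x, y, theta, lam):
    """Gradient of the per-example loss L(z, theta) at z = (x, y)."""
    return x * (x @ theta - y) + lam * theta

rng = np.random.default_rng(0)
n, d, lam, k = 500, 3, 0.1, 7
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)

theta_star, H = fit(X, y, lam)

# Perturb one training point: z_k -> z~_k
x_new, y_new = X[k] + 0.5, y[k] - 1.0
delta_grad = (grad_point(x_new, y_new, theta_star, lam)
              - grad_point(X[k], y[k], theta_star, lam))
theta_approx = theta_star - np.linalg.solve(H, delta_grad) / n

# Ground truth: retrain on the perturbed data set
X_pert, y_pert = X.copy(), y.copy()
X_pert[k], y_pert[k] = x_new, y_new
theta_exact, _ = fit(X_pert, y_pert, lam)

err_approx = np.linalg.norm(theta_approx - theta_exact)
err_baseline = np.linalg.norm(theta_star - theta_exact)
print(f"approximation error {err_approx:.2e} vs. no-update error {err_baseline:.2e}")
```

The approximation reuses the Hessian of the unperturbed problem, so the estimate costs one linear solve instead of a full retraining run.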
GloVe ” ”
• ” ”
• PPMI
GloVe ( )
•
( $\tilde{X}$ ) $\tilde{X}$
$\tilde{w}$
$\tilde{w}$
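8
[Figure: $V \times V$ co-occurrence matrices with labels A and B, showing row $X_i$ with entries $X_{ij}$ and the perturbed row $\tilde{X}_i$ with entries $\tilde{X}_{ij}$]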
GloVe
• GloVe
•
• i
$X_i$
$\nabla_{w_i} L$, $\nabla^2_{w_i} L$
$\tilde{w}_i$
9
$z_i \to X_i$, $\theta \to w_i$
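Applying the substitution $z_i \to X_i$, $\theta \to w_i$ to the influence-function formula gives, as a sketch, $\tilde{w}_i \approx w_i - \frac{1}{V} H_{w_i}^{-1} \left[ \nabla_{w_i} L(\tilde{X}_i, w_i) - \nabla_{w_i} L(X_i, w_i) \right]$. The code below assumes the per-row loss $L(X_i, w) = V \sum_j f(X_{ij}) (w^\top u_j + b_i + c_j - \log X_{ij})^2$ with $u$, $b$, $c$ held fixed, and $H_{w_i} = \frac{1}{V} \nabla^2_w L(X_i, w)$. Those modelling choices, the strictly positive counts, and all function names are assumptions made for illustration.

```python
import numpy as np

def f(x, x_max=100.0, alpha=0.75):
    """GloVe weighting function."""
    return np.where(x < x_max, (x / x_max) ** alpha, 1.0)

def grad_row(x_row, w, u, b_i, c):
    """Gradient w.r.t. w of L(X_i, w) = V * sum_j f(X_ij) * resid_j^2.

    Assumes every count in x_row is positive so that log is defined.
    """
    V = len(x_row)
    resid = u @ w + b_i + c - np.log(x_row)
    return 2.0 * V * (f(x_row) * resid) @ u

def hessian_row(x_row, u):
    """H_{w_i} = (1/V) * Hessian of L(X_i, w) = 2 * sum_j f(X_ij) u_j u_j^T."""
    return 2.0 * (u * f(x_row)[:, None]).T @ u

def approx_perturbed_vector(w_i, x_row, x_row_new, u, b_i, c):
    """Estimate w~_i after X_i -> X~_i without retraining."""
    V = len(x_row)
    delta = grad_row(x_row_new, w_i, u, b_i, c) - grad_row(x_row, w_i, u, b_i, c)
    return w_i - np.linalg.solve(hessian_row(x_row, u), delta) / V
```

Only a $D \times D$ system is solved per affected word, which is what makes the estimate cheap compared with retraining the whole embedding.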
: WEAT
• The Word Embedding Association Test [Caliskan+ (2017)]
10
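$B_{\mathrm{weat}}$
$\mathcal{S}, \mathcal{T}, \mathcal{A}, \mathcal{B}$
$B_{\mathrm{weat}} \sim (\vec{\mathcal{S}}_{\mathrm{cent}} - \vec{\mathcal{T}}_{\mathrm{cent}}) \cdot (\vec{\mathcal{A}}_{\mathrm{cent}} - \vec{\mathcal{B}}_{\mathrm{cent}})$, where $\vec{\mathcal{S}}_{\mathrm{cent}}$ is the centroid of $\mathcal{S}$
$\mathcal{S}, \mathcal{T}$: target word sets; $\mathcal{A}, \mathcal{B}$: attribute word sets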
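A minimal sketch of the WEAT effect size follows, assuming the definition in Caliskan et al. (2017): the difference between the mean associations of the two target sets, divided by the standard deviation of the associations over $\mathcal{S} \cup \mathcal{T}$. For unit-length vectors the numerator equals the centroid dot product shown above. The function names are hypothetical, and using the population standard deviation (NumPy's default) is an assumption.

```python
import numpy as np

def _cos(a, b):
    """Cosine similarity between two vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def association(w, A, B):
    """s(w, A, B): mean cosine to attribute set A minus mean cosine to B."""
    return np.mean([_cos(w, a) for a in A]) - np.mean([_cos(w, b) for b in B])

def weat_effect_size(S, T, A, B):
    """WEAT effect size for target sets S, T and attribute sets A, B.

    Each argument is a sequence of embedding vectors (1-D arrays).
    """
    s_assoc = np.array([association(s, A, B) for s in S])
    t_assoc = np.array([association(t, A, B) for t in T])
    pooled = np.concatenate([s_assoc, t_assoc])
    return (s_assoc.mean() - t_assoc.mean()) / pooled.std()
```

With target sets of equal size, this normalisation keeps the score within $[-2, 2]$; the extremes are reached when every word in one target set has the same association and every word in the other has a different one.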
• p
• k PPMI
PPMI B X
• 1
• k
$X^{(k)}$
$X$: $X = \sum_{k=1}^{n} X^{(k)}$
$B(w(\tilde{X}))$, $X$
11
($\because \tilde{X} = X - X^{(k)}$)
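The decomposition $X = \sum_{k=1}^{n} X^{(k)}$ and the perturbation $\tilde{X} = X - X^{(k)}$ can be sketched as below, assuming the differential bias of document $k$ is $\Delta B = B(w(X)) - B(w(\tilde{X}))$. The unweighted symmetric window counts are a simplification (GloVe itself down-weights distant pairs), and `embed` and `bias` are hypothetical callables standing in for the embedding training (or its influence-function approximation) and the WEAT score.

```python
import numpy as np

def cooccurrence(tokens, vocab, window=2):
    """Symmetric window co-occurrence counts X^(k) for one document.

    Out-of-vocabulary tokens are dropped before windows are formed.
    """
    idx = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(vocab), len(vocab)))
    ids = [idx[t] for t in tokens if t in idx]
    for pos, i in enumerate(ids):
        for j in ids[max(0, pos - window):pos]:
            X[i, j] += 1.0
            X[j, i] += 1.0
    return X

def differential_bias(X_parts, k, embed, bias):
    """B(w(X)) - B(w(X~)) with X = sum_k X^(k) and X~ = X - X^(k)."""
    X = np.sum(X_parts, axis=0)
    X_tilde = X - X_parts[k]
    return bias(embed(X)) - bias(embed(X_tilde))

vocab = ["he", "she", "codes", "cooks"]
docs = [["he", "codes", "she", "cooks"], ["she", "codes"]]
X_parts = [cooccurrence(doc, vocab) for doc in docs]
```

Because the counts are additive over documents, removing a document never requires re-scanning the corpus; only its own $X^{(k)}$ is needed.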
12
•
•
•
• word2vec
GloVe
13
14
• Table 2
• Wiki-WEAT2
• NYT-WEAT1
• Supplemental Material
$B_{\mathrm{weat}} \in [-2.0, 2.0]$
15
NYT-WEAT1
•
•
( δ ( ))
16
•
• {increase, decrease, random} 3
• increase(decrease) ( ) m
increase(decrease)-m
• random
17
• B
• Ground Truth ,
•
• :Ground Truth
• B
• 

B 0
$r^2 \geq 0.985$
18
( PPMI)
• word2vec, GloVe
• PPMI V 2000
• GloVe 300 word2vec 10000
• 1.4M debias
• Top1 Acc.
19
• WEAT1 (S,T)
• 10k
• [Bolukbasi+,2016]
• base 1.14 unperturbed
• T female
• S male
• inc/dec
Fig.5 T
(female ) S
( )
Fig.5 base gender axis
( )
•
20
•
• NYT-WEAT1 "For Women in Astronomy"
•
• NYT-WEAT1 "The Guide"
• WEAT ( )
• 2
•
• WEAT ( )
•
• 0.07%
(Fig.3)
$r^2 = 0.828$
21
• GloVe
•
•
22
