Link Prediction in (Partially)
Aligned Heterogeneous Social
Networks
By: Sina Sajadmanesh
Advisor: Dr. Hamid Reza Rabiee
2 DMLLink Prediction DMLDMLLink Prediction2
Outline
Introduction
- Problem Formulation
- Applications
Link Prediction in Homogeneous Networks
Link Prediction in Heterogeneous Networks
Link Prediction in Aligned Networks
Future Works
Introduction
Problem
- Given a snapshot of a network, predicting the set of links that will be formed in the future is formally defined as the Link Prediction Problem.
- First proposed by Liben-Nowell and Kleinberg at CIKM 2003
Applications
Social networks and E-commerce
- Recommender systems
- Friend recommendation
Bioinformatics
- Prioritization of candidate disease genes
- Drug discovery
Security
- Identifying missing links between criminals
- Controlling computer viruses
Outline
Introduction
Link Prediction in Homogeneous Networks
- Unsupervised methods
- Supervised methods
Link Prediction in Heterogeneous Networks
Link Prediction in Aligned Networks
Future Works
Link Prediction in Homogeneous Networks
Homogeneous Network
$G = (V, E)$
- If $V$ contains a single type of node and $E$ contains a single type of link, then $G$ is a homogeneous network
Link Prediction in Homogeneous Networks
Unsupervised methods
- Measure the closeness between nodes
- Assume that close nodes are more likely to be connected
Unsupervised Link Predictors
- Local neighbor based predictors
- Path based predictors
- Random-walk based predictors
Unsupervised Methods
Local Neighbor based Link Predictors
- Preferential Attachment
$PA(u, v) = |\Gamma(u)| \cdot |\Gamma(v)|$
where $\Gamma(u)$ is the set of neighbors of user $u$
Unsupervised Methods
Local Neighbor based Link Predictors
- Common Neighbor
$CN(u, v) = |\Gamma(u) \cap \Gamma(v)|$
where $\Gamma(u)$ is the set of neighbors of user $u$
Unsupervised Methods
Local Neighbor based Link Predictors
- Jaccard Coefficient
$JC(u, v) = \frac{|\Gamma(u) \cap \Gamma(v)|}{|\Gamma(u) \cup \Gamma(v)|}$
- Adamic/Adar
$AA(u, v) = \sum_{w \in \Gamma(u) \cap \Gamma(v)} \frac{1}{\log |\Gamma(w)|}$
- Resource Allocation
$RA(u, v) = \sum_{w \in \Gamma(u) \cap \Gamma(v)} \frac{1}{|\Gamma(w)|}$
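The five local-neighbor scores can be computed directly from neighbor sets. The following is a minimal sketch, assuming an undirected graph stored as a dictionary of neighbor sets; the function name `local_scores` and the toy edge list are illustrative and not part of the original slides.

```python
import math

def local_scores(adj, u, v):
    """Return the local-neighbor scores for the node pair (u, v).

    adj maps each node to the set of its neighbors (undirected graph).
    """
    nu, nv = adj[u], adj[v]
    common = nu & nv
    union = nu | nv
    return {
        "PA": len(nu) * len(nv),
        "CN": len(common),
        "JC": len(common) / len(union) if union else 0.0,
        # A common neighbor of two distinct nodes has degree >= 2, so log(degree) > 0.
        "AA": sum(1.0 / math.log(len(adj[w])) for w in common),
        "RA": sum(1.0 / len(adj[w]) for w in common),
    }

# Hypothetical toy graph with edges a-c, a-d, b-c, b-d, c-e.
edges = [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "e")]
adj = {}
for x, y in edges:
    adj.setdefault(x, set()).add(y)
    adj.setdefault(y, set()).add(x)

scores = local_scores(adj, "a", "b")
print(scores)
```

Here nodes a and b share both of their neighbors, so the Jaccard coefficient is 1 while Adamic/Adar and Resource Allocation down-weight the higher-degree common neighbor c.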
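Unsupervised Methods
Path based Link Predictors
- Shortest Path
$SP(u, v) = \min\{ |P_{u \rightsquigarrow v}| \}$
- Katz Score
$Katz(u, v) = \sum_{l=1}^{\infty} \beta^{l} \, |P^{l}_{u \rightsquigarrow v}| = \left[ (I - \beta A)^{-1} - I \right]_{u,v}$
where $P^{l}_{u \rightsquigarrow v}$ is the set of paths of length $l$ from $u$ to $v$ and $A$ is the adjacency matrix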
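The closed form of the Katz score can be checked numerically against the truncated series. This is a minimal sketch using NumPy, assuming that path counts are read as walk counts (entries of powers of the adjacency matrix) and that the damping factor is below the reciprocal of the spectral radius, which the series needs in order to converge; `katz_matrix` and the toy path graph are illustrative names and data.

```python
import numpy as np

def katz_matrix(A, beta):
    """Closed-form Katz scores: sum_{l>=1} beta^l A^l = (I - beta A)^{-1} - I."""
    A = np.asarray(A, dtype=float)
    # The series converges only if beta < 1 / spectral radius of A.
    rho = max(abs(np.linalg.eigvals(A)))
    if beta * rho >= 1:
        raise ValueError("beta must be smaller than 1 / spectral radius of A")
    I = np.eye(A.shape[0])
    return np.linalg.inv(I - beta * A) - I

# Hypothetical toy graph: the path 0 - 1 - 2 - 3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
beta = 0.1
K = katz_matrix(A, beta)

# Truncated series for comparison (terms beyond l = 29 are negligible here).
approx = sum(beta ** l * np.linalg.matrix_power(A, l) for l in range(1, 30))
print(np.allclose(K, approx))
```

Closer node pairs receive higher scores because short walks are damped less than long ones.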
Unsupervised Methods
Random-Walk based Link Predictors
- Hitting Time
$HT(u, v) = 1 + \sum_{w \in \Gamma(u)} P_{u,w} \, HT(w, v)$
- Commute Time
$CT(u, v) = HT(u, v) + HT(v, u)$
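The hitting-time recursion is a linear system once HT(v, v) = 0 is fixed, so it can be solved directly. The sketch below assumes a connected undirected graph with a simple random walk (uniform transition probabilities over neighbors); the function names and the three-node path graph are illustrative.

```python
import numpy as np

def hitting_times_to(A, v):
    """Expected number of steps for a random walk to first reach v from every node."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    P = A / A.sum(axis=1, keepdims=True)       # row-stochastic transition matrix
    others = [i for i in range(n) if i != v]
    Q = P[np.ix_(others, others)]              # transitions among nodes other than v
    # HT(u, v) = 1 + sum_w P[u, w] * HT(w, v) with HT(v, v) = 0  =>  (I - Q) h = 1
    h = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
    ht = np.zeros(n)
    ht[others] = h
    return ht

def commute_time(A, u, v):
    """CT(u, v) = HT(u, v) + HT(v, u)."""
    return hitting_times_to(A, v)[u] + hitting_times_to(A, u)[v]

# Hypothetical toy graph: the path 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
ht = hitting_times_to(A, 2)
print(ht, commute_time(A, 0, 2))
```

Smaller hitting or commute times indicate closer nodes, so these values are typically negated when used as link-prediction scores.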
Link Prediction in Homogeneous Networks
Other Unsupervised Methods
- Matrix Factorization Methods
- Maximum Likelihood Methods
- Probabilistic Methods
Link Prediction in Homogeneous Networks
Supervised Link Prediction
- Learn a binary classifier that predicts whether a link exists between a given pair of nodes.
- First proposed by Al Hasan et al. in SDM 2006
Supervised Link Prediction
Dataset
- We need two snapshots of the network for training
Positive Samples
- Links that are missing in the former snapshot but are formed in the latter snapshot
Negative Samples
- Links that are missing in both snapshots
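The labeling rule for positive and negative samples can be sketched as follows, assuming undirected links given as node pairs and a fixed node set shared by both snapshots; `build_samples` and the toy snapshots are illustrative and not taken from the slides.

```python
from itertools import combinations

def build_samples(nodes, edges_old, edges_new):
    """Label every node pair that is unlinked in the former snapshot.

    Positive: the pair is linked in the latter snapshot.
    Negative: the pair is unlinked in both snapshots.
    """
    old = {frozenset(e) for e in edges_old}
    new = {frozenset(e) for e in edges_new}
    positives, negatives = [], []
    for u, v in combinations(sorted(nodes), 2):
        pair = frozenset((u, v))
        if pair in old:
            continue  # already linked in the former snapshot, so not a candidate
        (positives if pair in new else negatives).append((u, v))
    return positives, negatives

# Hypothetical toy snapshots.
nodes = ["a", "b", "c", "d"]
edges_old = [("a", "b")]
edges_new = [("a", "b"), ("b", "c")]
positives, negatives = build_samples(nodes, edges_old, edges_new)
print(positives, negatives)
```

In real networks the negatives vastly outnumber the positives, so in practice the negative set is usually subsampled before training.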
Supervised Link Prediction
Features
- Topological Features
- Attribute-based Features
Classifiers
- SVM
- Decision Trees
- Multilayer Perceptron
- KNN
- Naïve Bayes
Outline
Introduction
Link Prediction in Homogeneous Networks
Link Prediction in Heterogeneous Networks
- Relationship Prediction
- Collective Link Prediction
Link Prediction in Aligned Networks
Future Works
Link Prediction in Heterogeneous Networks
Heterogeneous Network
$G = (V, E)$
- $V = \bigcup_i V_i$ is the set of all nodes, where $V_i$ is the set of nodes of the $i$-th kind in $G$
- $E = \bigcup_j E_j$ is the set of all links, where $E_j$ is the set of links of the $j$-th kind in $G$
Link Prediction in Heterogeneous Networks
Heterogeneous Network Schema
Link Prediction in Heterogeneous Networks
From Link Prediction to Relationship Prediction
- A relationship between two objects could be a composition of two or more links
  • E.g. two authors have a co-author relationship if they have co-written a paper
- Need to redesign topological features in heterogeneous networks
Supervised Relationship Prediction
Feature Extraction
- Heterogeneous Features
  • Based on the heterogeneous structure of the network
- Meta-Path based Features
  • Use the concept of meta-paths
  • Different meta-paths represent different semantic meanings
  • Number of path instances of a meta-path Φ
Supervised Relationship Prediction
Case Study
- Social Link Prediction
Supervised Relationship Prediction
Heterogeneous Features
- Social Features
  • Common Neighbor
  • Jaccard Coefficient
  • Adamic/Adar
- Spatial Features
  • Common Locations
  • Jaccard Coefficient of Common Locations
  • Average geographic distance of locations
  • …
Supervised Relationship Prediction
Heterogeneous Features
- Temporal Features
  • Let T(u) be the 24-hour activity vector of user u
  • Inner product of T(u) and T(v)
  • Cosine similarity of T(u) and T(v)
- Text Content Features
  • Let w(u) be the bag-of-words vector of user u, weighted by TF-IDF
  • Inner product of w(u) and w(v)
  • Cosine similarity of w(u) and w(v)
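Both the temporal and the text features reduce to an inner product and a cosine similarity between two user vectors. A minimal sketch, assuming T(u) counts a user's posts per hour of the day; the activity counts below are hypothetical, and the same two functions would apply unchanged to TF-IDF vectors w(u).

```python
import math

def inner(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity; defined as 0 when either vector is all zeros."""
    na, nb = math.sqrt(inner(a, a)), math.sqrt(inner(b, b))
    return inner(a, b) / (na * nb) if na and nb else 0.0

# Hypothetical 24-hour activity vectors (number of posts per hour).
T_u = [0] * 24
T_v = [0] * 24
T_u[9], T_u[20] = 3, 4    # user u is active at 09:00 and 20:00
T_v[8], T_v[20] = 3, 4    # user v is active at 08:00 and 20:00

print(inner(T_u, T_v), cosine(T_u, T_v))
```

The inner product grows with overall activity volume, while the cosine similarity normalizes it away, which is why both are commonly used as separate features.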
Supervised Relationship Prediction
Meta-Path based Features
(U: user, P: post, W: word, T: timestamp, L: location)
- Φ1: Follower of Follower
  • U → U → U
- Φ2: Common Out Neighbor
  • U → U ← U
- Φ3: Common In Neighbor
  • U ← U → U
- Φ4: Common Words
  • U → P → W ← P ← U
- Φ5: Common Timestamps
  • U → P → T ← P ← U
- Φ6: Common Location Check-ins
  • U → P → L ← P ← U
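The number of path instances of the three user-only meta-paths can be obtained from products of the follow matrix. This is a sketch under assumed conventions: `F[i, j] = 1` means user i follows user j, "Follower of Follower" is read as U → U → U, "Common Out Neighbor" as two users following the same user, and "Common In Neighbor" as two users followed by the same user. The matrix below is hypothetical toy data.

```python
import numpy as np

# F[i, j] = 1 if user i follows user j (hypothetical toy data).
F = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]])

# Entry [u, v] of each product counts the path instances between u and v.
follower_of_follower = F @ F      # u -> w -> v
common_out_neighbor = F @ F.T     # u -> w <- v
common_in_neighbor = F.T @ F      # u <- w -> v

print(follower_of_follower[0, 2], common_out_neighbor[0, 1], common_in_neighbor[1, 2])
```

Diagonal entries count paths from a user back to themselves and are normally ignored when building features for a pair of distinct users. The longer meta-paths through posts can be counted the same way by chaining the user-post and post-attribute incidence matrices.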
Link Prediction in Heterogeneous Networks
Collective Link Prediction
- Conventional link prediction approaches assume that links are independent and identically distributed (i.i.d.).
- But in heterogeneous networks, different types of links are correlated and mutually influential.
Link Prediction in Heterogeneous Networks
Input Social Network
Link Prediction in Heterogeneous Networks
Independent Social and Location Link Prediction
Link Prediction in Heterogeneous Networks
Traditional Link Prediction
$\hat{Y}_s = \arg\max_{Y_s} P(y(L_s) = Y_s)$
$\hat{Y}_l = \arg\max_{Y_l} P(y(L_l) = Y_l)$
- $L_s$ and $L_l$ are the sets of potential social and location links
- $P(y(L_s) = Y_s)$ is the probability score achieved when the links in $L_s$ are assigned the labels $Y_s$
- $\hat{Y}_s$ and $\hat{Y}_l$ are the sets of optimal labels
Link Prediction in Heterogeneous Networks
Collective Link Prediction
$\hat{Y}_s, \hat{Y}_l = \arg\max_{Y_s, Y_l} P(y(L_s) = Y_s \mid y(L_l) = Y_l) \times P(y(L_l) = Y_l \mid y(L_s) = Y_s)$
Iterative Solution
$Y_s^{(t)} = \arg\max_{Y_s} P(y(L_s) = Y_s \mid y(L_s) = Y_s^{(t-1)}, \; y(L_l) = Y_l^{(t-1)})$
$Y_l^{(t)} = \arg\max_{Y_l} P(y(L_l) = Y_l \mid y(L_s) = Y_s^{(t)}, \; y(L_l) = Y_l^{(t-1)})$
Outline
Introduction
Link Prediction in Homogeneous Networks
Link Prediction in Heterogeneous Networks
Link Prediction in Aligned Networks
- Link Transfer
- Anchor Link Inference
Future Works
Link Prediction in Aligned Networks
Multi Aligned Heterogeneous Social Networks
Link Prediction in Aligned Networks
Link Transfer
- Information Sparsity
  • New Network Problem
  • Cold Start Problem
- Need to transfer knowledge from another domain
  • Transfer Learning
  • Use an aligned network as a source
- Context
  • Fully aligned networks
  • Partially aligned networks
Link Prediction in Aligned Networks
Link Transfer across Fully Aligned Networks
- Prediction using only the target network:
$P(y(u^t, v^t) = 1 \mid G^t) = P(y(u^t, v^t) = 1 \mid x(u^t, v^t))$
- Prediction using source and target network:
$P(y(u^t, v^t) = 1 \mid G^t, G^s) = P\left(y(u^t, v^t) = 1 \mid \left[ x(u^t, v^t)^T, \; x(u^s, v^s)^T, \; y(u^s, v^s) \right]^T \right)$
Link Prediction in Aligned Networks
Link Transfer across Partially Aligned Networks
- Solution: Inter-Network Meta-Paths
- Let $\gamma(U^t, U^s)$ be the anchor meta-path
- $\Psi_1$: $\gamma(U^t, U^s) - \Phi(U^s, U^s) - \gamma(U^s, U^t)$
- $\Psi_2$: $\Phi(U^t, U^t) - \gamma(U^t, U^s) - \Phi(U^s, U^s) - \gamma(U^s, U^t)$
- $\Psi_3$: $\gamma(U^t, U^s) - \Phi(U^s, U^s) - \gamma(U^s, U^t) - \Phi(U^t, U^t)$
- $\Psi_4$: $\Phi(U^t, U^t) - \gamma(U^t, U^s) - \Phi(U^s, U^s) - \gamma(U^s, U^t) - \Phi(U^t, U^t)$
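Counting instances of the first inter-network meta-path (anchor link, then a source-network path, then an anchor link back) amounts to a three-matrix product. The sketch below assumes the intra-network meta-path in the source network is a plain social link and that anchor links are stored in a binary matrix `T`, with `T[i, j] = 1` when target user i and source user j are the same person; all matrices are hypothetical toy data.

```python
import numpy as np

# Anchor links between 3 target users and 3 source users (hypothetical).
# Target user 2 has no counterpart in the source network (partial alignment).
T = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 0]])

# Social links in the source network, standing in for the meta-path Phi(U^s, U^s).
A_s = np.array([[0, 1, 1],
                [1, 0, 0],
                [1, 0, 0]])

# psi1[u, v] = number of paths u^t -> u^s -> v^s -> v^t.
psi1 = T @ A_s @ T.T
print(psi1)
```

Target users without an anchor link receive all-zero rows and columns, which is why the longer meta-paths prepend or append a target-network path to reach such users through their anchored neighbors.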
Link Prediction in Aligned Networks
Anchor Link Inference
Anchor Link Inference
Supervised Method
- Social Features
  • Extended Common Neighbors
  • Extended Jaccard Coefficient
  • Extended Adamic/Adar
- Spatial Features
- Temporal Features
- Text Content Features
Anchor Link Inference
Inference w.r.t. the One-to-One Constraint
- Predicted Scores
Anchor Link Inference
Inference w.r.t. the One-to-One Constraint
- Conventional Link Prediction
Anchor Link Inference
Inference w.r.t. the One-to-One Constraint
- Max sum of scores
Anchor Link Inference
Inference w.r.t. the One-to-One Constraint
- Stable Matching
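The two one-to-one inference strategies named on the preceding slides can give different answers on the same predicted scores. The sketch below assumes a square score matrix `S` with distinct entries, where `S[i][j]` is the predicted anchor-link score between source user i and target user j. The max-sum matching is found by brute force over permutations, which is only feasible for tiny examples, and the stable matching uses the Gale-Shapley procedure with source users proposing; both function names and the scores are illustrative.

```python
from itertools import permutations

def max_sum_matching(S):
    """One-to-one assignment that maximizes the total score (brute force)."""
    n = len(S)
    best = max(permutations(range(n)),
               key=lambda p: sum(S[i][p[i]] for i in range(n)))
    return list(enumerate(best))

def stable_matching(S):
    """Gale-Shapley matching where both sides rank partners by score."""
    n = len(S)
    # Each source user ranks target users by descending score.
    prefs = [sorted(range(n), key=lambda j: -S[i][j]) for i in range(n)]
    next_choice = [0] * n            # index of the next target each source proposes to
    partner_of_target = [None] * n
    free = list(range(n))
    while free:
        i = free.pop()
        j = prefs[i][next_choice[i]]
        next_choice[i] += 1
        current = partner_of_target[j]
        if current is None:
            partner_of_target[j] = i
        elif S[i][j] > S[current][j]:
            partner_of_target[j] = i   # target j prefers the new proposer
            free.append(current)
        else:
            free.append(i)             # rejected, will propose to the next choice
    return sorted((i, j) for j, i in enumerate(partner_of_target))

# Hypothetical predicted scores.
S = [[0.9, 0.8],
     [0.8, 0.1]]
print(max_sum_matching(S), stable_matching(S))
```

On these scores the max-sum assignment separates the highest-scoring pair (0, 0) to gain a larger total, whereas the stable matching keeps it because neither user in that pair would prefer another partner.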
Outline
Introduction
Link Prediction in Homogeneous Networks
Link Prediction in Heterogeneous Networks
Link Prediction in Aligned Networks
Future Works
Future Works
Anchor Link Formation Prediction
- Predicting whether a user in the source network will join the target network in the future
- This depends on the amount of influence he/she receives from the target network
- An influence model can be learned using the training data
- Positive-Unlabeled learning can improve the prediction performance
References
[1] L. Lü and T. Zhou, "Link prediction in complex networks: A survey," Physica A: Statistical Mechanics and its Applications, vol. 390, pp. 1150-1170, 2011.
[2] M. Al Hasan, V. Chaoji, S. Salem, and M. Zaki, "Link prediction using supervised learning," in SDM'06: Workshop on Link Analysis, Counter-terrorism and Security, 2006.
[3] J. Zhang and P. S. Yu, "Link Prediction across Heterogeneous Social Networks: A Survey," 2014.
[4] Y. Sun, R. Barber, M. Gupta, C. C. Aggarwal, and J. Han, "Co-author relationship prediction in heterogeneous bibliographic networks," in 2011 International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2011, pp. 121-128.
[5] X. Kong, J. Zhang, and P. S. Yu, "Inferring anchor links across multiple heterogeneous social networks," in Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, 2013, pp. 179-188.
Q&A
