Are Meta-Paths Necessary?
Revisiting Heterogeneous Graph Embeddings
Rana Hussein, Dingqi Yang and Philippe Cudré-Mauroux
eXascale Infolab, University of Fribourg, Switzerland
27th ACM International Conference on Information and Knowledge Management (CIKM 2018)
Graph Embeddings
• Represent nodes in a graph using a vector space.
• Learn a latent space representation of the graph structure and node interactions.
• Community detection
• Friendship recommendation
• User interest prediction
Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. DeepWalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining. ACM, 701–710.
Graph Embeddings Techniques
• A typical approach combines random walks with a SkipGram-like model.
Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013).
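The random walk + SkipGram pipeline can be illustrated with a minimal sketch. It assumes a uniform random walk over an adjacency list and only produces the (center, context) training pairs that a SkipGram model would consume; the toy graph and the helper names `random_walk` and `skipgram_pairs` are hypothetical and not taken from the slides.

```python
import random

def random_walk(adj, start, length, rng):
    """Uniform random walk of at most `length` nodes starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk

def skipgram_pairs(walk, window):
    """(center, context) pairs for nodes at most `window` positions apart."""
    pairs = []
    for i, center in enumerate(walk):
        lo, hi = max(0, i - window), min(len(walk), i + window + 1)
        pairs.extend((center, walk[j]) for j in range(lo, hi) if j != i)
    return pairs

# Toy undirected graph (hypothetical data).
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
rng = random.Random(0)
walks = [random_walk(adj, node, length=5, rng=rng) for node in adj]
pairs = [p for w in walks for p in skipgram_pairs(w, window=2)]
```

The walks play the role of sentences and the nodes the role of words, which is why a word-embedding model can be reused to embed nodes.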
Heterogeneous Graphs
• Heterogeneous graphs contain multiple node types (domains) and two kinds of edges:
• Homogeneous edges: link nodes from the same domain
• Heterogeneous edges: link nodes across different domains
• The proximity among nodes is based on semantics.
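As a small illustration of the two edge kinds, the sketch below labels each edge by comparing the domains of its endpoints. The toy bibliographic graph and the `edge_kind` helper are hypothetical.

```python
from collections import Counter

# Hypothetical toy graph: each node is mapped to its domain (node type).
domain = {"alice": "author", "bob": "author", "p1": "paper", "cikm": "venue"}
edges = [("alice", "bob"), ("alice", "p1"), ("bob", "p1"), ("p1", "cikm")]

def edge_kind(u, v, domain):
    """An edge is homogeneous when both endpoints share a domain."""
    return "homogeneous" if domain[u] == domain[v] else "heterogeneous"

counts = Counter(edge_kind(u, v, domain) for u, v in edges)
```

Here only the author-author edge is homogeneous; the other three edges cross domains.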
Heterogeneous Graph embeddings
• A meta-path is a sequence of node types encoding key composite relations among the involved
node types.
• Meta-paths are used to guide random walks to redefine the neighborhood of a node.
• Metapath2vec (KDD 2017)
Yuxiao Dong, Nitesh V Chawla, and Ananthram Swami. 2017. metapath2vec: Scalable representation learning for heterogeneous networks. In Proceedings of the 23rd ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 135–144.
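A meta-path-guided walk can be sketched as follows. This is a simplified illustration, not the Metapath2vec reference implementation: it assumes a symmetric meta-path such as author-paper-author that is repeated cyclically, and it picks uniformly among neighbours of the required type. The toy graph and the function name `metapath_walk` are hypothetical.

```python
import random

def metapath_walk(adj, node_type, start, metapath, length, rng):
    """Walk that only steps to neighbours matching the next meta-path type.

    `metapath` must start and end with the same type (e.g. ["A", "P", "A"])
    so that it can be repeated: A-P-A-P-A-...
    """
    assert metapath[0] == metapath[-1] == node_type[start]
    pattern = metapath[:-1]  # drop the repeated end type so the pattern cycles
    walk = [start]
    while len(walk) < length:
        wanted = pattern[len(walk) % len(pattern)]
        candidates = [n for n in adj[walk[-1]] if node_type[n] == wanted]
        if not candidates:  # no neighbour of the required type: stop early
            break
        walk.append(rng.choice(candidates))
    return walk

# Hypothetical toy graph with authors (A) and papers (P).
node_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P"}
adj = {"a1": ["p1", "a2"], "a2": ["p1", "p2", "a1"], "p1": ["a1", "a2"], "p2": ["a2"]}
walk = metapath_walk(adj, node_type, "a1", ["A", "P", "A"], length=5, rng=random.Random(1))
types = [node_type[n] for n in walk]
```

Note that the author-author edge between `a1` and `a2` is never used, because the meta-path never asks for two consecutive authors. This is how the choice of meta-path redefines a node's neighbourhood.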
Challenges
• How should meta-paths be selected?
• Selection is graph-specific and depends heavily on prior knowledge from domain experts.
• Strategies for combining a set of meta-paths can be complex and computationally expensive.
• The choice of meta-paths strongly affects the quality of the learnt node embeddings for a specific task.
Are meta-paths necessary?
JUST - Heterogeneous Graph Embeddings technique
• We propose a two-step graph embedding technique for heterogeneous information networks (HINs):
• Step 1: Random Walk with JUmp and STay strategies to probabilistically control the random walk.
• Step 2: Learn node embeddings with the SkipGram model.
Random Walk with JUmp and STay strategies (JUST)
1- Jump or stay?
• Objective: Balance the number of heterogeneous and homogeneous edges traversed during random walks.
• α ∈ [0, 1] is an initial stay probability.
• The stay decision also takes into account the number of nodes consecutively visited in the same domain.
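One way to picture the jump-or-stay decision is the sketch below. The formula is not visible on the slide, so the exponential decay `alpha ** run_length` used here is an assumption, as are the function names and the `can_stay` / `can_jump` flags (whether the current node has any homogeneous or heterogeneous edge to follow).

```python
import random

def stay_probability(alpha, run_length, can_stay, can_jump):
    """Probability of following a homogeneous edge at the current node.

    ASSUMPTION: the probability decays as alpha ** run_length, where
    run_length counts the nodes consecutively visited in the current domain.
    """
    if not can_stay:  # no homogeneous edge available: the walk must jump
        return 0.0
    if not can_jump:  # no heterogeneous edge available: the walk must stay
        return 1.0
    return alpha ** run_length

def jump_or_stay(alpha, run_length, can_stay, can_jump, rng):
    """Sample the decision for one step of the walk."""
    p = stay_probability(alpha, run_length, can_stay, can_jump)
    return "stay" if rng.random() < p else "jump"

rng = random.Random(42)
decisions = [jump_or_stay(0.5, run, True, True, rng) for run in (1, 2, 3, 4)]
```

Under this assumption, the longer a walk lingers in one domain, the more likely it is to jump, which keeps the mix of homogeneous and heterogeneous edges from drifting to either extreme.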
Random Walk with JUmp and STay strategies (JUST)
2- Where to Jump?
• Objective: Control the randomness in choosing a target domain.
• Define a fixed-length queue Qhist that memorizes up to m previously visited domains.
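The fixed-length history queue maps naturally onto `collections.deque` with `maxlen`. The selection rule below is an assumption made for illustration: prefer reachable domains that are not in the recent history, and fall back to any reachable domain when none is left. The domain names and the function `choose_target_domain` are hypothetical.

```python
import random
from collections import deque

def choose_target_domain(current, reachable, history, rng):
    """Pick the domain to jump to, preferring domains not visited recently.

    `reachable` holds the domains the current node can reach through
    heterogeneous edges. ASSUMPTION: uniform choice within the chosen pool.
    """
    options = sorted(d for d in reachable if d != current)
    fresh = [d for d in options if d not in history]
    pool = fresh or options  # fall back when every option was visited recently
    return rng.choice(pool) if pool else None

m = 2                      # history size
q_hist = deque(maxlen=m)   # the oldest entry is dropped automatically
for visited in ("venue", "author", "paper"):
    q_hist.append(visited)  # queue now holds ["author", "paper"]

rng = random.Random(7)
target = choose_target_domain("paper", {"author", "venue", "topic"}, q_hist, rng)
q_hist.append(target)
```

With m = 2 the walk avoids bouncing straight back to the domain it just left; a larger m pushes the walk to cycle through more domains before revisiting one.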
Random Walk with JUmp and STay strategies (JUST)
• For each node in the graph, we start a random walk and continue until the maximum length is reached.
• Maximize the co-occurrence probability of two nodes appearing within a context window in the random walk, using the SkipGram model.
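The SkipGram objective mentioned on this slide is commonly written as below. The notation is an assumption introduced here, not taken from the slides: $\mathcal{W}$ is the set of generated walks, $k$ the context window size, $V$ the node set, $\Phi$ the node embeddings being learnt, and $\Phi'$ the context embeddings.

```latex
\max_{\Phi} \sum_{w \in \mathcal{W}} \sum_{v_i \in w}
  \sum_{\substack{v_c \in w \\ 0 < |c - i| \le k}}
  \log \Pr(v_c \mid v_i; \Phi),
\qquad
\Pr(v_c \mid v_i; \Phi) =
  \frac{\exp\!\left(\Phi'(v_c)^{\top} \Phi(v_i)\right)}
       {\sum_{u \in V} \exp\!\left(\Phi'(u)^{\top} \Phi(v_i)\right)}
```

In word2vec-style implementations the softmax denominator is usually approximated, for example with negative sampling, because summing over all nodes is expensive.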
Experimental evaluation - Datasets
Datasets: DBLP, Movie, Foursquare
Experimental evaluation - Baselines
• Homogeneous graph embedding techniques:
• DeepWalk
• LINE
• Heterogeneous graph embedding techniques:
• PTE
• Metapath2vec
• Hin2vec
• JUST_no_memory (simplified version of our proposed method)
Node classification results
JUST achieves state-of-the-art performance and outperforms the baselines.
Node clustering results
JUST outperforms the baselines on all datasets.
Combining several meta-paths may not consistently outperform manually selecting one meta-path.
[Figure: node clustering results on DBLP, Movie, and Foursquare for DeepWalk, LINE, PTE, Metapath2vec, Hin2vec, JUST_no_memory, and JUST]
Impact of initial stay probability α
• α balances the impact of heterogeneous and homogeneous edges on the learnt embeddings.
• We tune α within [0.1, 0.9] with a step of 0.1.
• Results are suboptimal when walks traverse too many heterogeneous or too many homogeneous edges.
• Balancing the number of edges is key to learning high-quality embeddings.
• The optimal α lies in the range [0.2, 0.4] on all three datasets in both node classification and clustering tasks.
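The α sweep described on this slide amounts to a small grid search. The sketch below only shows the mechanics: `toy_score` is a placeholder with no connection to the reported experiments, and in practice it would be replaced by a function that trains embeddings for a given α and returns a task metric.

```python
def alpha_grid(start=0.1, stop=0.9, step=0.1):
    """Grid of stay probabilities; rounding avoids floating-point drift."""
    n = round((stop - start) / step) + 1
    return [round(start + i * step, 10) for i in range(n)]

def best_alpha(score_fn, alphas):
    """Return the alpha with the highest score."""
    return max(alphas, key=score_fn)

def toy_score(alpha):
    """PLACEHOLDER objective, not a measured result."""
    return -(alpha - 0.3) ** 2

alphas = alpha_grid()
chosen = best_alpha(toy_score, alphas)
```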
Runtime Performance
• End-to-end node embedding learning time (in seconds) for all random-walk-based methods.

Method                    DBLP    Movie   Foursquare
DeepWalk                   236      333          484
Metapath2vec (original)    965   19,200        2,248
Metapath2vec (ours)        290      408          550
Hin2vec                    904    1,301        1,801
JUST                       310      442          616
• Compared to DeepWalk and Metapath2vec, JUST adds minor overhead in learning time but achieves better results in classification and clustering tasks.
• Compared to Hin2vec, JUST learns roughly 3x faster and achieves better results in most experiments.
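The speedup and overhead claims can be checked directly against the runtime table. The sketch below copies the reported timings (in seconds) and computes the ratios; the dictionary layout and the helper name `ratio` are just for illustration.

```python
# Learning times in seconds, copied from the runtime table.
runtime = {
    "DeepWalk":            {"DBLP": 236, "Movie": 333, "Foursquare": 484},
    "Metapath2vec (ours)": {"DBLP": 290, "Movie": 408, "Foursquare": 550},
    "Hin2vec":             {"DBLP": 904, "Movie": 1301, "Foursquare": 1801},
    "JUST":                {"DBLP": 310, "Movie": 442, "Foursquare": 616},
}
datasets = ["DBLP", "Movie", "Foursquare"]

def ratio(slower, faster, dataset):
    """How many times longer `slower` takes than `faster` on `dataset`."""
    return runtime[slower][dataset] / runtime[faster][dataset]

speedup_vs_hin2vec = {d: round(ratio("Hin2vec", "JUST", d), 2) for d in datasets}
overhead_vs_deepwalk = {d: round(ratio("JUST", "DeepWalk", d), 2) for d in datasets}
```

The Hin2vec-to-JUST ratio comes out at about 2.9 on each dataset, which matches the roughly 3x figure, and JUST takes about 1.3 times as long as DeepWalk.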
Conclusions
• We propose JUST, a heterogeneous graph embedding technique that uses random walks with jump and stay strategies and requires no prior knowledge.
• JUST achieves state-of-the-art performance in classification and clustering tasks without using meta-paths.
• We plan to investigate how JUST performs on other graph structures, such as knowledge graphs.
