Accuracy Evaluation of Overlapping and
Multi-resolution Clustering Algorithms
on Large Datasets
IEEE BigComp 2019
Artem Lutov, Mourad Khayati and Philippe Cudré-Mauroux
eXascale Infolab, University of Fribourg, Switzerland
https://github.com/eXascaleInfolab/xmeasures
https://bit.ly/xmeasures-slides
Contributions
● Extension of the existing accuracy metrics to increase their discriminative power
● Optimization of the accuracy metrics calculation to speed up the evaluation process
● Recommendations for the applicability of the accuracy metrics considering the interpretability of their values
2
Informal Objective
Generalized Evaluation of the Clustering Quality
Generalized: for any (large and complex) clustering results (datasets)
Complex clustering results: multiple membership of the elements
Quality: the extrinsic quality (accuracy) quantifies the expected results (compared to the ground-truth), unlike the intrinsic quality, which quantifies statistical properties of the formed clusters
3
Complex Clustering Results (Multiple Membership)
4
[Figure: Overlapping Clusters (blue cars, racing cars, racing & blue cars, bikes) vs Clusters on Various Resolutions (cars: jeeps, racing cars; bikes)]
Accuracy Evaluation for Clusterings
5
Matching the clusterings (unordered sets of elements), even when the elements have a single membership, may yield multiple best matches:
=> Strict cluster labeling is not always possible, and not always desirable.
Many dedicated accuracy metrics have been designed, but few of them are applicable to elements with multiple membership.
[Figure: produced vs ground-truth clusterings — does the yellow cluster best match the dark or the cyan category?]
Our Requirements for the Accuracy Metrics
● Applicable to elements having multiple membership
● Applicable to large datasets: ideally O(N), runtime up to O(N²)
Families with accuracy metrics satisfying our requirements:
● Pair Counting Based Metrics: Omega Index [Collins, 1988]
● Cluster Matching Based Metrics: Average F1 score [Yang, 2013]
● Information Theory Based Metrics: Generalized NMI [Esquivel, 2012]
Problem: interpretability of the accuracy values, and the metric selection.
6
Interpretability of the Accuracy Values
The four formal constraints identify aspects of the clustering quality being captured by different accuracy metrics:
● Homogeneity
● Completeness
● Rag Bag
● Size vs Quality (Micro Weighting)
7
Amigó et al., 2009
Rosenberg et al., 2007
Homogeneity
8
Clusters should not mix elements
belonging to different categories.
[Figures: low vs high homogeneity; taken from Amigó et al., 2009]
Completeness
9
Elements belonging to the same category
should be clustered together.
[Figures: low vs high completeness; taken from Amigó et al., 2009]
Rag Bag
10
Elements with low relevance to the categories (e.g., noise) should preferably be assigned to the less homogeneous clusters (macro-scale, low-resolution, coarse-grained or top-level clusters in a hierarchy).
[Figures: low vs high rag bag]
Size vs Quality (Micro Weighting)
11
A small assignment error in a large cluster is preferable to a large number of similar errors in small clusters.
[Figures: low vs high size vs quality]
Omega Index (Fuzzy ARI) [Collins, 1988]
Omega Index (𝛀) counts the pairs of elements that occur together in exactly the same number of clusters as categories, adjusted for the expected number of such pairs:
12
Ω(C', C) = (Ω_u(C', C) − Ω_e(C', C)) / (1 − Ω_e(C', C)),
Ω_u(C', C) = (1/P) Σ_j |t_j(C') ∩ t_j(C)|,
Ω_e(C', C) = (1/P²) Σ_j |t_j(C')| · |t_j(C)|,
where t_j(C) is the set of element pairs co-occurring in exactly j clusters of C, P is the total number of element pairs, C' is the ground-truth (the categories) and C is the produced clustering.
Omega Index Counterexample
A counterexample for the Omega Index (highly overlapping clusters):
13
Ground-truth      Low         High
C1': 1 2 3        C1: 1 2     C1: 1 2
C2': 2 3 4        C2: 3 4     C2: 2 3
C3': 3 4 1                    C3: 3 4
C4': 4 1 2                    C4: 4 1

              Low   High
Omega Index    0     0
Soft Omega Index
The Soft Omega Index takes into account pairs present in a different number of clusters by normalizing the smaller number of occurrences of each pair of elements in the clusters of one clustering by the larger number of occurrences in the other clustering:
14
Ω_u^s(C', C) = (1/P) Σ_p min(m_{C'}(p), m_C(p)) / max(m_{C'}(p), m_C(p)),
where m_C(p) is the number of clusters of C containing both elements of the pair p.
Omega Index vs Soft Omega Index
A counterexample for the Omega Index (highly overlapping clusters):
15
Ground-truth      Low         High
C1': 1 2 3        C1: 1 2     C1: 1 2
C2': 2 3 4        C2: 3 4     C2: 2 3
C3': 3 4 1                    C3: 3 4
C4': 4 1 2                    C4: 4 1

              Low   High
Omega          0     0
Soft Omega     0     0.33
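The counterexample can be checked with a short script. This is a hedged sketch of the pair-counting idea only: it computes the observed (chance-unadjusted) agreements, so the exact-match agreement is 0 for both clusterings, while the soft min/max ratio already separates them (the names `exact_agreement` and `soft_agreement` are mine, not from xmeasures).

```python
from itertools import combinations

def pair_counts(clusters, elems):
    """For every element pair, the number of clusters it co-occurs in."""
    cnt = {p: 0 for p in combinations(sorted(elems), 2)}
    for c in clusters:
        for p in combinations(sorted(c), 2):
            cnt[p] += 1
    return cnt

def exact_agreement(cs1, cs2, elems):
    """Fraction of pairs co-occurring in exactly the same number of clusters."""
    c1, c2 = pair_counts(cs1, elems), pair_counts(cs2, elems)
    return sum(c1[p] == c2[p] for p in c1) / len(c1)

def soft_agreement(cs1, cs2, elems):
    """Soft matching: min/max ratio of the co-occurrence counts per pair."""
    c1, c2 = pair_counts(cs1, elems), pair_counts(cs2, elems)
    return sum(min(c1[p], c2[p]) / max(c1[p], c2[p])
               for p in c1 if max(c1[p], c2[p])) / len(c1)

gt   = [{1, 2, 3}, {2, 3, 4}, {3, 4, 1}, {4, 1, 2}]
low  = [{1, 2}, {3, 4}]
high = [{1, 2}, {2, 3}, {3, 4}, {4, 1}]
elems = {1, 2, 3, 4}

print(exact_agreement(gt, low, elems), exact_agreement(gt, high, elems))  # 0.0 0.0
print(round(soft_agreement(gt, high, elems), 2))                          # 0.33
```

Every pair occurs in exactly 2 categories but in at most 1 produced cluster, so the exact agreement is 0 for both clusterings; the soft ratio credits the High clustering with 0.5 per shared pair.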
Average F1 Score [Yang, 2013]
F1a is defined as the average of the weighted F1 scores of a) the best-matching ground-truth clusters to the formed clusters and b) the best-matching formed clusters to the ground-truth clusters:
16
F1a(C', C) = ½ (F_{C',C} + F_{C,C'}),   F_{X,Y} = Σ_{x∈X} w_x · max_{y∈Y} F1(x, y),
where F1 is the F1-measure [Rijsbergen, 1974] and w_x weights the contribution of cluster x.
Mean F1 Scores: F1h
F1h uses the harmonic instead of the arithmetic mean to address F1a ≳ 0.5 for the clustering produced from all combinations of the nodes (F_{C',C} = 1, since for each category there exists an exactly matching cluster, while F_{C,C'} → 0, since the majority of the clusters have low similarity to the categories):
17
F1h(C', C) = 2 · F_{C',C} · F_{C,C'} / (F_{C',C} + F_{C,C'}), with each cluster weighted by the contribution m of its nodes.
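The contrast between F1a and F1h on the degenerate "all combinations" clustering can be sketched as below. This is a simplified, unweighted rendering (each cluster contributes equally; the paper's node-contribution weighting is omitted), and the helper names `f1a`/`f1h`/`best_avg` are mine:

```python
from itertools import chain, combinations

def f1(x, y):
    """Symmetric F1 of two sets: 2|x∩y| / (|x| + |y|)."""
    i = len(x & y)
    return 2 * i / (len(x) + len(y)) if i else 0.0

def best_avg(xs, ys):
    """Average best-match F1 of the clusters in xs against clustering ys."""
    return sum(max(f1(x, y) for y in ys) for x in xs) / len(xs)

def f1a(c1, c2):
    """Arithmetic mean of the two directional averages."""
    return (best_avg(c1, c2) + best_avg(c2, c1)) / 2

def f1h(c1, c2):
    """Harmonic mean: penalizes a one-sided match."""
    a, b = best_avg(c1, c2), best_avg(c2, c1)
    return 2 * a * b / (a + b) if a + b else 0.0

# Degenerate clustering: every non-empty combination of the nodes.
nodes = range(1, 9)
cats = [set(range(1, 5)), set(range(5, 9))]            # ground-truth categories
allc = [set(s) for s in chain.from_iterable(
        combinations(nodes, k) for k in range(1, 9))]  # 255 clusters

print(best_avg(cats, allc))               # 1.0: each category appears verbatim
print(f1a(cats, allc) > 0.5)              # True: F1a stays above 0.5
print(f1h(cats, allc) < f1a(cats, allc))  # True: the harmonic mean is stricter
```

Since every category is present verbatim among the combinations, the first directional average is exactly 1, pinning F1a above 0.5 no matter how poor the clustering is; the harmonic mean does not have this floor.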
Mean F1 Scores: F1p
18
F1p is the harmonic mean of the average, over each clustering, of the best local probabilities (f1 → pprob) for each cluster.
Indexing Technique for Mean F1 Score
19
Purpose: O(N(|C'| + |C|)) → O(N)

Cluster:
  mbs      # member nodes, const
  cont     # members' contribution, const
  counter  # contributions counter

Counter:
  orig     # originating cluster
  ctr      # raw counter, <= |mbs|

for a in g2.mbs:
  for c in cls(C, a):    # clusters of C containing node a
    cc = c.counter
    if cc.orig != g2:
      cc.ctr = 0; cc.orig = g2
    cc.ctr += 1/|C.a| if ovp else 1
    fmatch(cc.ctr, c.cont, g2.cont)

[Figure: cluster g2 of C' being matched against the clusters c1, c3 of C sharing its nodes]
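A runnable rendering of the indexing trick above, keeping the slide's names (mbs, cont, ctr, orig) where possible. Unit (non-fuzzy) node contributions are assumed, `node_index` plays the role of cls, and the F1 update stands in for fmatch; these concrete choices are mine, not necessarily those of xmeasures:

```python
from collections import defaultdict

class Cluster:
    def __init__(self, mbs):
        self.mbs = set(mbs)        # member nodes, const
        self.cont = len(self.mbs)  # members' contribution (unit per node), const
        self.orig = None           # query cluster that last touched this counter
        self.ctr = 0.0             # raw contributions counter, <= |mbs|

def node_index(clusters):
    """cls: node -> clusters containing it, built once per clustering."""
    idx = defaultdict(list)
    for c in clusters:
        for a in c.mbs:
            idx[a].append(c)
    return idx

def best_f1(g2, idx):
    """Best F1 match for cluster g2, visiting only the clusters that share
    nodes with it; stale counters are reset lazily via the `orig` tag, so
    no per-query pass over all clusters is needed."""
    best = 0.0
    for a in g2.mbs:
        for c in idx[a]:
            if c.orig is not g2:
                c.ctr, c.orig = 0.0, g2
            c.ctr += 1  # unit contribution of node a
            best = max(best, 2 * c.ctr / (c.cont + g2.cont))
    return best

C = [Cluster({1, 2}), Cluster({3, 4, 5})]
idx = node_index(C)
print(best_f1(Cluster({1, 2, 3}), idx))  # 0.8: overlap 2 with {1, 2}
```

The lazy reset is the key: each query touches only the counters of overlapping clusters, which yields the overall O(N) behavior claimed on the slide.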
Xmeasures MF1 vs ParallelComMetric F1-Measure
20
SNAP DBLP (Nodes: 317,080; Edges: 1,049,866; Clusters: 13,477) ground-truth vs clustering by Louvain. Evaluation on an Intel Xeon E5-2620 (32 logical CPUs) @ 2.10 GHz; apps compiled using GCC 5.4 with the -O3 flag.
Xmeasures MF1 vs ParallelComMetric F1-Measure
Execution Time (sec)
21
Dataset               | xmeasures @ 1 CPU  | pcomet @ 32 CPUs
DBLP (317K n, 13K c)  | 0.50 (90x speedup) | 44.44
Youtube (1.13M, 16K)  | 0.53 (450x)        | 244.29
Amazon (335K, 272K)   | 1.84 (450x)        | 823.06
LiveJrn. (64M, 464K)  | 7.65 (> 4,100x)    | 31,608
Generalized Normalized Mutual Information (NMI)
NMI is the Mutual Information I(C':C) normalized by the maximal or mean value of the unconditional entropies H of the clusterings C', C:
22
NMI(C', C) = I(C':C) / max(H(C'), H(C)),
I(C':C) = H(C') + H(C) − H(C', C),
where H(C', C) is the joint entropy. GNMI [Esquivel, 2012] uses a stochastic process to compute the MI.
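For hard (non-overlapping) partitions, the definitions above reduce to a few lines; this is a hedged sketch of max-normalized NMI only, not of the GNMI stochastic estimator for overlapping clusterings:

```python
from math import log
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label assignment, in nats."""
    n = len(labels)
    return -sum(c / n * log(c / n) for c in Counter(labels).values())

def nmi(a, b):
    """Max-normalized NMI of two hard partitions given as label lists:
    I(a:b) = H(a) + H(b) - H(a, b), divided by max(H(a), H(b))."""
    ha, hb = entropy(a), entropy(b)
    mi = ha + hb - entropy(list(zip(a, b)))  # joint entropy via label pairs
    return mi / max(ha, hb) if max(ha, hb) else 1.0

print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))            # 1.0: identical up to relabeling
print(abs(nmi([0, 0, 1, 1], [0, 1, 0, 1])) < 1e-9)  # True: independent partitions
```

Note that NMI is invariant to label permutations, which is exactly why clusterings can be compared without any cluster labeling step.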
Generalized NMI Optimizations
● Dynamically evaluate the maximal number of stochastic events
● Apply weighted adaptive sampling instead of fully random sampling: only the nodes co-occurring in the clusters are sampled, and they are weighted to discount the contribution (importance) of frequent nodes
23
Performance of GNMI Optimized vs Original
[Chart: accuracy of the optimized GNMI vs the original]
Also, the optimized GNMI is orders of magnitude faster than GNMIorig.
24
Formal Constraints Satisfaction
* - empirically satisfies
+ - satisfies
25
[Table: formal constraint satisfaction of the evaluated metrics]
Metrics Applicability
26
(Soft) 𝛀: O(N²); performs poorly for multi-resolution clusterings. Values are not affected by the number of clusters.
MF1: O(N); evaluates the best-matching clusters only (unfair advantage for the larger clusters). F1p satisfies more formal constraints than the others.
GNMI: biased to the number of clusters; non-deterministic results; convergence is not guaranteed in the stochastic implementation. Highly parallelized, evaluates full matches, well-grounded theoretically.
Q&A
Artem Lutov <artem.lutov@unifr.ch>
https://github.com/eXascaleInfolab/xmeasures
27
Supplementary Slides
28
Average F1 Score Corrected
F1a can be normalized by the total or the maximal number of clusters to remain indicative for a large number of clusters. But this indicativity is lost for a small number of clusters: for |C'| = 2 ground-truth clusters and an intentionally formed single cluster |C| = 1, F1ax(C', C) ≳ 1/3.
29