How well does your Instance Matching system perform? Experimental evaluation with LANCE

Tzanina Saveta, Evangelia Daskalaki, Giorgos Flouris,
Irini Fundulaki
Institute of Computer Science – FORTH, Greece
Axel-Cyrille Ngonga Ngomo
IFI/AKSW, University of Leipzig, Germany
10/31/16 ISWC 2016: How well does your Instance Matching system perform? Experimental evaluation with LANCE 1

Why Instance Matching?
*Adapted from Suchanek & Weikum tutorial@SIGMOD 2013
Different sources contain different descriptions of the same real-world entity

Instance Matching for Linked Data
•  A set of RDF triples constitutes an RDF graph
•  Sparse data
•  Rich semantics expressed in terms of ontologies
•  Large number of sources to integrate
•  Value, structure and semantics heterogeneities
*Adapted from Suchanek & Weikum tutorial@SIGMOD 2013

Benchmarking
Instance matching has led to the development of a number
of matching techniques and tools
•  How to compare those?
•  How to assess their performance (efficiency and effectiveness)?
•  How to “push” systems into becoming better?
•  Benchmark your systems!

Instance Matching Benchmark Components
•  Datasets
–  The source and target datasets that will be matched against each other to
find the entities that refer to the same real-world object
•  Ground truth / Gold standard / Reference alignment
–  The “correct answer sheet” used to judge the completeness and
soundness of the results produced by the system under test (SUT)
•  Organized into test cases, each addressing a different kind of
instance matching requirement
•  Metrics
–  The performance metric(s) that determine the systems’
efficiency and effectiveness
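
How the gold standard is used to judge soundness and completeness can be sketched as follows. This is a minimal illustration and not LANCE's own evaluation code: it assumes matches are represented as (source URI, target URI) pairs, and the function name `evaluate` and the sample URIs are hypothetical.

```python
def evaluate(system_matches, gold_standard):
    """Return (precision, recall, f_measure) of a system's matches.

    Both arguments are sets of (source_uri, target_uri) pairs; this
    representation is an assumption made for the example.
    """
    true_positives = system_matches & gold_standard
    # Precision (soundness): share of reported matches that are correct.
    precision = len(true_positives) / len(system_matches) if system_matches else 0.0
    # Recall (completeness): share of correct matches that were reported.
    recall = len(true_positives) / len(gold_standard) if gold_standard else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical data: three reference matches; the system finds two of them
# plus one wrong pair.
gold = {("s:1", "t:1"), ("s:2", "t:2"), ("s:3", "t:3")}
found = {("s:1", "t:1"), ("s:2", "t:2"), ("s:4", "t:9")}
precision, recall, f_measure = evaluate(found, gold)
```

Representing both result sets as Python sets keeps the true-positive computation to a single intersection.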


LANCE
•  A novel instance matching benchmark generator
•  Domain-independent
•  Highly configurable and scalable
•  Standard value-based and structure-based test cases
•  Advanced semantics-aware test cases considering OWL2
expressive constructs
•  Rich weighted gold standard
•  Additional metrics: similarity score metric

LANCE Architecture
[Figure: LANCE architecture diagram. Labels recoverable from the diagram: Source Data; Data Ingestion Module; RDF Repository; Initialization Module; SPARQL Queries (Schema Stats); Resource Generator; SPARQL Queries (IR); Test Case Generation Parameters; Test Case Generator; Resource Transformation Module; Weight Computation Module; SAMPLER; RESCAL [NT12]; MATCHER; Matched Instances; Target Data; Weighted Gold Standard.]

Test Cases
Test cases are built using a variety of transformations
•  Value-based test cases
–  Transformations of values of data type properties
•  Structure-based test cases
–  Transformations of structure of object and data type properties
•  Semantics-aware test cases
–  Transformations at the instance level considering the schema
•  Simple and complex combinations of the first three categories
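
A value-based test case can be pictured with the sketch below. It is only an illustration under stated assumptions: the typo-style transformation, the function name `swap_adjacent_chars`, the dictionary layout of an instance and the example URIs are all hypothetical and are not taken from LANCE.

```python
import random


def swap_adjacent_chars(value, rng):
    """Return `value` with one randomly chosen pair of adjacent characters swapped."""
    if len(value) < 2:
        return value
    i = rng.randrange(len(value) - 1)
    chars = list(value)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


# Hypothetical source instance: a URI plus one data type property value.
source = {"uri": "http://example.org/source/1", "title": "Instance Matching"}

rng = random.Random(42)  # fixed seed so the sketch is repeatable
target = {
    # Assumption for this sketch: the transformed instance gets a new URI.
    "uri": "http://example.org/target/1",
    "title": swap_adjacent_chars(source["title"], rng),
}

# In this sketch the (source, target) pair is recorded as ground truth.
gold_standard = {(source["uri"], target["uri"])}
```

A structure-based or semantics-aware transformation would change which properties the target instance carries instead of editing a literal value.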

LANCE Performance Metrics
•  Average similarity score: average difficulty of the matched instances
–  Benchmark with high average similarity score: matched instances are
easier to find
•  Standard deviation: spread of similarity scores for the matched instances
–  Benchmark with high standard deviation:
•  scores are spread out from the average
•  more heterogeneity of matched instances
Obtain a more fine-grained understanding of an IM system’s
performance by comparing the average similarity score and the
standard deviation of the system with those of the benchmark
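
The comparison described above can be sketched as follows. This is an illustration under assumptions, not LANCE's implementation: the weighted gold standard is modelled as a dictionary from a match identifier to a similarity score, the identifiers and scores are invented, and the population standard deviation (`pstdev`) is one possible choice.

```python
from statistics import mean, pstdev


def summarize(similarity_scores):
    """Return (average, standard deviation) of a list of similarity scores."""
    return mean(similarity_scores), pstdev(similarity_scores)


# Hypothetical weighted gold standard: match id -> similarity score in [0, 1].
benchmark_scores = {"m1": 0.95, "m2": 0.90, "m3": 0.75, "m4": 0.70}
# Hypothetical system output: the true positives the system found.
system_true_positives = ["m1", "m2"]

bench_avg, bench_std = summarize(list(benchmark_scores.values()))
sys_avg, sys_std = summarize([benchmark_scores[m] for m in system_true_positives])

# A system average above the benchmark average suggests the system mostly
# found the easier (high-similarity) matches.
finds_mostly_easy_matches = sys_avg > bench_avg
```

In this invented example the system's scores are also less spread out than the benchmark's, which would indicate that it covers only part of the heterogeneity the benchmark contains.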

Experiments
•  Efficiency and effectiveness of IM systems using LANCE benchmarks
–  Systems:
•  LogMap Version 2.4 [JG11] (MoRe Reasoner [RG13])
•  OtO [DP12]
•  LIMES (EAGLE IM algorithm [NL12])
–  Datasets
•  LDBC’s SPIMBENCH Generator (Semantic Publishing
Benchmark)
•  UOBM
–  Matching Task
•  All 5 categories introduced previously
•  All instances were transformed


SPIMBENCH: Standard Metrics
•  LogMap
–  Responds well to the value-based test cases
–  Reduced performance when semantics-aware test
cases were also applied

SPIMBENCH: Standard Metrics
•  OtO and EAGLE
–  Give good results on the value-based
transformations
–  Reduced performance in the remaining categories
•  EAGLE is non-deterministic and uses unsupervised learning

UOBM: Standard Metrics
•  LogMap
1. Does not perform well in any of the categories
2. Performance not affected by the dataset size
•  OtO
1. Performs better than LogMap
2. Reduced performance as dataset size increases

SPIMBENCH: Additional Metrics
Distribution of similarity scores for LANCE and True Positive
matches from IM systems for semantics-aware test cases in
the case of the 10K triples dataset.

•  LogMap can address difficult test cases

•  EAGLE & OtO can address mostly value-based test cases

[Figure: distribution of similarity scores for OtO, EAGLE, LogMap and LANCE. x-axis: similarity scores (0.7 to 1); y-axis: log(# of mappings). A second panel is titled "Standard Deviation".]

UOBM: Additional Metrics
Distribution of similarity scores for LANCE and True Positive
matches from IM systems for structure-based test cases in
the case of the 10K triples dataset.

•  LogMap does not cope well with the change of URIs in the instances
[Figure: distribution of similarity scores for OtO, LogMap and LANCE. x-axis: similarity (0.6 to 0.9); y-axis: log(# of mappings). A second panel compares OtO, LogMap and LANCE on a 0 to 0.08 scale.]

Lessons Learned

•  Different types of transformations affect an IM system’s
performance
•  The characteristics of source datasets affect the behavior of
IM systems

Questions?

Acknowledgments
This project has received funding from the European Union’s
Horizon 2020 research and innovation programme under grant
agreement No 688227.

References
[JG11] E. Jimenez-Ruiz and B. C. Grau. LogMap: Logic-based and scalable ontology matching.
In ISWC, 2011.
[RG13] A. A. Romero, B. C. Grau, et al. MORe: a Modular OWL Reasoner for Ontology
Classification. In ORE, pages 61-67, 2013.
[DP12] E. Daskalaki and D. Plexousakis. OtO Matching System: A Multi-strategy Approach to
Instance Matching. In CAiSE, 2012.
[NL12] A.-C. Ngonga Ngomo and K. Lyko. EAGLE: Efficient Active Learning of Link
Specifications using Genetic Programming. In ESWC, 2012.