Navigating and Exploring RDF Data using Formal
Concept Analysis
Mehwish Alam Amedeo Napoli
LORIA (CNRS – Inria Nancy Grand Est – Université de Lorraine), BP 239, Vandoeuvre les
Nancy, F-54506, France,
{firstname.lastname}@loria.fr
September 11th 2015
Linked Data for Knowledge Discovery - ECML/PKDD - Porto, Portugal.
1 / 31
Motivation
A Scenario
A patient takes a drug that is a cardiovascular agent and that
causes a side effect, i.e., an allergic reaction.
The domain expert (a medical doctor) should look for drugs which are
cardiovascular agents but do not cause any side effects or allergic
reactions (for this patient).
The domain expert chooses some datasets, resources and knowledge
sources to be searched, e.g., DrugBank (drugs with their categories),
SIDER (side effects), MeSH, MedDRA, ...
The objective of this research work is to provide a platform supporting
the querying of such resources.
Requirements: drug information (DrugBank) and side effect information (SIDER).
2 / 31
Motivation
Problem Statement
3 / 31
Motivation
RDF Graph for DrugBank
Figure: RDF graph for DrugBank (nodes: Drug, DB01193, P10635, Fatty acids, acebutolol, Cytochrome P450; edge labels: targets, rdf:type, rdfs:label, hasPathway, rdfs:subClassOf).
4 / 31
Motivation
RDF Triples as Entity and Description
tid  Subject  Predicate        Object  Provenance
t1   s1       p1               o11     dataset1
t2   s1       p1               o15     dataset1
t3   s1       p2               o26     dataset2
t4   s1       rdf:type         Drug    dataset2
t5   o11      rdf:type         C1      dataset3
t6   C1       rdfs:subClassOf  C3      dataset3
t7   o26      rdf:type         D6      dataset4
t8   D6       rdfs:subClassOf  D8      dataset4
...  ...      ...              ...     ...
Table: Sample RDF triple store from several resources.
Entities S  Descriptions Ds
s1          {(p1:{C1, C5, C7}), (p2:{D6})}
s2          {(p1:{C1, C13, C11}), (p2:{D7})}
s3          {(p1:{C5}), (p2:{D7})}
s4          {(p1:{C13, C11, C7}), (p2:{D1})}
s5          {(p1:{C1, C7}), (p2:{D6})}
Table: RDF triples as entities S and descriptions Ds.
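The step from the triple store to the entity descriptions can be sketched in Python. This is only an illustration under an assumption the slides do not spell out: the description of an entity under a predicate is taken to be the set of classes (via rdf:type) of the objects it points to. The function name `build_descriptions` is hypothetical. Only rows t1 to t8 are encoded, so the result for s1 is smaller than in the second table, whose remaining classes come from the elided rows.

```python
from collections import defaultdict

# Rows t1..t8 of the sample triple store: (subject, predicate, object).
TRIPLES = [
    ("s1", "p1", "o11"),
    ("s1", "p1", "o15"),
    ("s1", "p2", "o26"),
    ("s1", "rdf:type", "Drug"),
    ("o11", "rdf:type", "C1"),
    ("C1", "rdfs:subClassOf", "C3"),
    ("o26", "rdf:type", "D6"),
    ("D6", "rdfs:subClassOf", "D8"),
]


def build_descriptions(triples, entity_class="Drug"):
    """Map each entity of type `entity_class` to {predicate: set of classes}.

    Assumption: an object contributes the classes it is typed with;
    objects without a known type (such as o15 here) contribute nothing.
    """
    types = defaultdict(set)
    for s, p, o in triples:
        if p == "rdf:type":
            types[s].add(o)
    entities = {s for s, classes in types.items() if entity_class in classes}
    descriptions = {e: defaultdict(set) for e in entities}
    for s, p, o in triples:
        if s in entities and p != "rdf:type":
            descriptions[s][p] |= types.get(o, set())
    return {e: dict(d) for e, d in descriptions.items()}


DESCRIPTIONS = build_descriptions(TRIPLES)
print(DESCRIPTIONS)
```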
5 / 31
FCA: formal context and binary attributes
A formal context is a triple $(G, M, I)$ where $G$ is a set of objects, $M$ a set of
attributes, and $I$ a binary relation such that $(g, m) \in I$ means that "object $g$
owns attribute $m$".
A Galois connection characterizes formal concepts:
$A' = \{m \in M \mid \forall g \in A : (g, m) \in I\}$ for $A \subseteq G$
$B' = \{g \in G \mid \forall m \in B : (g, m) \in I\}$ for $B \subseteq M$
$(A, B)$ is a formal concept with extent $A = B'$ and intent $B = A'$, e.g.
$(\{g_3, g_4, g_5\}, \{m_2, m_3\})$.
Concepts are pairs of maximal sets of objects with corresponding
maximal sets of attributes.
     m1  m2  m3
g1   ×   ×   .
g2   ×   .   ×
g3   .   ×   ×
g4   .   ×   ×
g5   ×   ×   ×
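The two derivation operators can be sketched in a few lines of Python. The cross positions below are an assumption: the column alignment is lost in the extracted slide, so they are read off the derivations quoted elsewhere in the deck ($g_1' = \{m_1, m_2\}$, $g_2' = \{m_1, m_3\}$, $\{m_1\}' = \{g_1, g_2, g_5\}$). The names `intent`, `extent` and `is_concept` are illustrative.

```python
# Formal context of the running example: object -> set of owned attributes.
CONTEXT = {
    "g1": {"m1", "m2"},
    "g2": {"m1", "m3"},
    "g3": {"m2", "m3"},
    "g4": {"m2", "m3"},
    "g5": {"m1", "m2", "m3"},
}
ATTRIBUTES = {"m1", "m2", "m3"}


def intent(objects):
    """A': the attributes shared by every object in `objects`."""
    shared = set(ATTRIBUTES)
    for g in objects:
        shared &= CONTEXT[g]
    return shared


def extent(attributes):
    """B': the objects owning every attribute in `attributes`."""
    return {g for g, owned in CONTEXT.items() if set(attributes) <= owned}


def is_concept(objects, attributes):
    """(A, B) is a formal concept iff A' = B and B' = A."""
    return intent(objects) == set(attributes) and extent(attributes) == set(objects)


print(intent({"g1", "g2"}))
print(extent({"m1"}))
print(is_concept({"g3", "g4", "g5"}, {"m2", "m3"}))
```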
FCA: concept lattice
$(A_1, B_1) \le (A_2, B_2) \Leftrightarrow A_1 \subseteq A_2 \ (\Leftrightarrow B_2 \subseteq B_1)$
$(\{g_1, g_5\}, \{m_1, m_2\}) \le (\{g_1, g_2, g_5\}, \{m_1\})$
The concept lattice is based on two dual partial orderings:
Larger extents and smaller intents are in the “higher levels” of the
concept lattice.
Smaller extents and larger intents are in the “lower levels” of the
concept lattice.
Reduced labeling:
intents are “inherited” from top to bottom (top-down),
and extents are “inherited” from bottom to top (bottom-up).
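As a rough illustration of the two dual orderings, the sketch below enumerates every concept of the small running context by closing each subset of objects, then compares concepts by their extents. This brute-force enumeration is only practical for tiny contexts and is not claimed to be the method used by the authors. The cross positions are the same assumed reading of the example table, and all names are illustrative.

```python
from itertools import combinations

CONTEXT = {
    "g1": frozenset({"m1", "m2"}),
    "g2": frozenset({"m1", "m3"}),
    "g3": frozenset({"m2", "m3"}),
    "g4": frozenset({"m2", "m3"}),
    "g5": frozenset({"m1", "m2", "m3"}),
}
ATTRIBUTES = frozenset({"m1", "m2", "m3"})


def intent(objects):
    """Attributes shared by all objects (all attributes for the empty set)."""
    shared = ATTRIBUTES
    for g in objects:
        shared = shared & CONTEXT[g]
    return shared


def extent(attributes):
    """Objects owning all the given attributes."""
    return frozenset(g for g, owned in CONTEXT.items() if attributes <= owned)


def all_concepts():
    """Close every subset of objects; the distinct closures are the concepts."""
    concepts = set()
    objects = sorted(CONTEXT)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            b = intent(subset)
            concepts.add((extent(b), b))
    return concepts


def leq(c1, c2):
    """(A1, B1) <= (A2, B2) iff A1 is included in A2."""
    return c1[0] <= c2[0]


CONCEPTS = all_concepts()
# Print from larger extents (higher levels) to smaller extents (lower levels).
for ext, itt in sorted(CONCEPTS, key=lambda c: (-len(c[0]), sorted(c[0]))):
    print(sorted(ext), sorted(itt))
```

Ordering by extent inclusion and ordering by reverse intent inclusion give the same result on every pair, which is the duality stated on the slide.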
FCA and Pattern Structures
Conceptual scaling
The formal context is the basic data type of Formal Concept Analysis.
However, data are often given in the form of a many-valued context.
Many-valued contexts are translated to one-valued contexts via
conceptual scaling.
But this is not automatic and some arbitrary choices have to be made.
Examples of scalings:
Nominal: $K = (N, N, =)$
Ordinal: $K = (N, N, \le)$
Interordinal: $K = (N, N, \le \cup \ge)$
8 / 31
A numerical example
G / M  m1  m2  m3
g1     1   3   4
g2     2   2   3
g3     4   1   1
g4     3   2   1
Nominal Scaling:
G / M  m1=1  m1=2  m1=3  m1=4  m2=1  m2=2  m2=3  m3=1  m3=3  m3=4
g1     x     .     .     .     .     .     x     .     .     x
g2     .     x     .     .     .     x     .     .     x     .
g3     .     .     .     x     x     .     .     x     .     .
g4     .     .     x     .     .     x     .     x     .     .
Interordinal Scaling (first ten columns only):
G / M  m1≤1  m1≥1  m1≤2  m1≥2  m1≤3  m1≥3  m1≤4  m1≥4  m2≤1  m2≥1  ...
g1     x     x     x     .     x     .     x     .     .     x
g2     .     x     x     x     x     .     x     .     .     x
g3     .     x     .     x     .     x     x     x     x     x
g4     .     x     .     x     x     x     x     .     .     x
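Both scalings of the numerical example can be reproduced with a short sketch. It assumes that the thresholds of the interordinal scale are the values observed for each attribute and that the comparisons are non-strict, which is the reading that matches the number of crosses per row in the scaled table. Function names and the `m<=t` / `m>=t` attribute labels are illustrative.

```python
# Many-valued context of the numerical example.
DATA = {
    "g1": {"m1": 1, "m2": 3, "m3": 4},
    "g2": {"m1": 2, "m2": 2, "m3": 3},
    "g3": {"m1": 4, "m2": 1, "m3": 1},
    "g4": {"m1": 3, "m2": 2, "m3": 1},
}


def nominal_scale(data):
    """One binary attribute 'm=v' per observed attribute/value pair."""
    return {g: {f"{m}={v}" for m, v in row.items()} for g, row in data.items()}


def interordinal_scale(data):
    """Binary attributes 'm<=t' and 'm>=t' for every observed threshold t."""
    thresholds = {}
    for row in data.values():
        for m, v in row.items():
            thresholds.setdefault(m, set()).add(v)
    scaled = {}
    for g, row in data.items():
        attrs = set()
        for m, v in row.items():
            for t in sorted(thresholds[m]):
                if v <= t:
                    attrs.add(f"{m}<={t}")
                if v >= t:
                    attrs.add(f"{m}>={t}")
        scaled[g] = attrs
    return scaled


NOMINAL = nominal_scale(DATA)
INTERORDINAL = interordinal_scale(DATA)
print(sorted(NOMINAL["g4"]))
print(sorted(INTERORDINAL["g3"]))
```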
Computing similarity between descriptions
Intersection considered as a similarity operator:
$\{m_1, m_2\} \cap \{m_1, m_3\} = \{m_1\}$
     m1  m2  m3
g1   ×   ×   .
g2   ×   .   ×
g3   .   ×   ×
g4   .   ×   ×
g5   ×   ×   ×
$\cap$ induces a partial ordering relation $\subseteq$ as follows:
$S_1 \cap S_2 = S_1 \iff S_1 \subseteq S_2$
$\{m_1\} \cap \{m_1, m_2\} = \{m_1\} \iff \{m_1\} \subseteq \{m_1, m_2\}$
$\cap$ has the properties of a meet $\sqcap$ in a semi-lattice, i.e. a
commutative, associative and idempotent operation:
$c \sqcap d = c \iff c \sqsubseteq d$
The definition of a Pattern Structure
A pattern structure $(G, (D, \sqcap), \delta)$ is composed of:
$G$, a set of objects,
$(D, \sqcap)$, a semi-lattice of descriptions or patterns,
$\delta$, a mapping such that $\delta(g) \in D$ describes object $g$.
The Galois connection for $(G, (D, \sqcap), \delta)$ is defined as:
The maximal description representing the similarity of a set of objects:
$A^{\square} = \bigsqcap_{g \in A} \delta(g)$ for $A \subseteq G$
The maximal set of objects sharing a given description:
$d^{\square} = \{g \in G \mid d \sqsubseteq \delta(g)\}$ for $d \in (D, \sqcap)$
Standard FCA as a Pattern Structure $(G, (D, \sqcap), \delta)$
Considering a standard formal context $(G, M, I)$:
$G$ is the set of objects,
$(D, \sqcap)$ corresponds to $\wp(M)$, where $M$ is the set of attributes,
$\delta(g)$ corresponds to the description of $g$ in terms of attributes.
The Galois connection:
     m1  m2  m3
g1   ×   ×   .
g2   ×   .   ×
g3   .   ×   ×
g4   .   ×   ×
g5   ×   ×   ×
$A^{\square} = \bigsqcap_{g \in A} \delta(g)$ for $A \subseteq G$
$\{g_1, g_2\}' = g_1' \cap g_2' = \{m_1, m_2\} \cap \{m_1, m_3\} = \{m_1\}$
$d^{\square} = \{g \in G \mid d \sqsubseteq \delta(g)\}$ for $d \in (D, \sqcap)$
$\{m_1\}' = \{g_i \in G \mid \{m_1\} \subseteq g_i'\} = \{g_1, g_2, g_5\}$
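To make the correspondence concrete, here is a minimal generic sketch of a pattern structure in Python, instantiated with set intersection as the meet so that it behaves like standard FCA. The class `PatternStructure` and its method names are hypothetical, and subsumption is derived from the meet through $c \sqcap d = c \iff c \sqsubseteq d$.

```python
from functools import reduce


class PatternStructure:
    """(G, (D, meet), delta); c is subsumed by d iff meet(c, d) == c."""

    def __init__(self, delta, meet):
        self.delta = delta  # object -> description
        self.meet = meet    # similarity operation on descriptions

    def subsumed(self, c, d):
        return self.meet(c, d) == c

    def common_description(self, objects):
        """Meet of the descriptions of all objects in a non-empty collection."""
        return reduce(self.meet, (self.delta[g] for g in objects))

    def sharing_objects(self, d):
        """All objects whose description subsumes d."""
        return {g for g, desc in self.delta.items() if self.subsumed(d, desc)}


# Standard FCA: descriptions are attribute sets and the meet is intersection.
DELTA = {
    "g1": frozenset({"m1", "m2"}),
    "g2": frozenset({"m1", "m3"}),
    "g3": frozenset({"m2", "m3"}),
    "g4": frozenset({"m2", "m3"}),
    "g5": frozenset({"m1", "m2", "m3"}),
}
FCA = PatternStructure(DELTA, lambda c, d: c & d)

print(sorted(FCA.common_description(["g1", "g2"])))
print(sorted(FCA.sharing_objects(frozenset({"m1"}))))
```

Swapping in another `meet` (for instance the interval convex hull) reuses the same two operators unchanged.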
From FCA to Pattern Structures
A formal context $(G, M, I)$ is based on a set of objects $G$, a set of attributes $M$,
and a binary relation $I \subseteq G \times M$.
Two derivation operators are defined as follows, $\forall A \subseteq G, B \subseteq M$:
$A' = \{m \in M \mid \forall g \in A, (g, m) \in I\}$
$B' = \{g \in G \mid \forall m \in B, (g, m) \in I\}$
A formal concept $(A, B)$ verifies $A' = B$ and $A = B'$.
Formal concepts are partially ordered w.r.t. inclusion of extents (or dually of intents):
$(A_1, B_1) \le (A_2, B_2)$ iff $A_1 \subseteq A_2$
A pattern structure $(G, (D, \sqcap), \delta)$ is based on a set of objects $G$, a meet
semi-lattice of object descriptions $(D, \sqcap)$, and a mapping $\delta : G \longrightarrow D$
which associates a description to each object.
Two derivation operators are defined as follows, $\forall A \subseteq G, d \in (D, \sqcap)$:
$A^{\square} = \bigsqcap_{g \in A} \delta(g)$
$d^{\square} = \{g \in G \mid d \sqsubseteq \delta(g)\}$
A pattern concept $(A, d)$ verifies $A^{\square} = d$ and $A = d^{\square}$.
Pattern concepts are partially ordered w.r.t. inclusion of extents (or dually
inclusion of intents):
$(A_1, d_1) \le (A_2, d_2)$ iff $A_1 \subseteq A_2$
Interval Pattern Structure
Let $D$ be a set of intervals with integer bounds (for simplicity),
let $\sqcap$ be a meet operator defined on $D$ as the convex hull of intervals:
$[a_1, b_1] \sqcap [a_2, b_2] = [\min(a_1, a_2), \max(b_1, b_2)]$
$[4, 5] \sqcap [5, 5] = [4, 5]$
$[a_1, b_1] \sqsubseteq [a_2, b_2] \iff [a_2, b_2] \subseteq [a_1, b_1]$
$[4, 5] \sqsubseteq [5, 5] \iff [5, 5] \subseteq [4, 5]$
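The interval meet and the induced subsumption can be checked with a small sketch. Intervals are represented as `(low, high)` tuples, which is a representation chosen here for illustration.

```python
def meet(i1, i2):
    """Convex hull of two intervals given as (low, high) tuples."""
    (a1, b1), (a2, b2) = i1, i2
    return (min(a1, a2), max(b1, b2))


def subsumed(i1, i2):
    """i1 is subsumed by i2 iff meet(i1, i2) == i1, i.e. i2 lies inside i1."""
    return meet(i1, i2) == i1


print(meet((4, 5), (5, 5)))
print(subsumed((4, 5), (5, 5)))
print(subsumed((5, 5), (4, 5)))
```

Note the reversal: the larger interval is the smaller pattern, because it is the less specific description.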
Interval Pattern Structure
m1 m2 m3
g1 5 7 6
g2 6 8 4
g3 4 8 5
g4 4 9 8
g5 5 8 5
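$\{g_1, g_2\}^{\square} = \bigsqcap_{g \in \{g_1, g_2\}} \delta(g)$
$= \langle 5, 7, 6 \rangle \sqcap \langle 6, 8, 4 \rangle$
$= \langle [5, 6], [7, 8], [4, 6] \rangle$
$\langle [5, 6], [7, 8], [4, 6] \rangle^{\square} = \{g \in G \mid \langle [5, 6], [7, 8], [4, 6] \rangle \sqsubseteq \delta(g)\}$
$= \{g_1, g_2, g_5\}$
$(\{g_1, g_2, g_5\}, \langle [5, 6], [7, 8], [4, 6] \rangle)$ is a pattern concept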
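The worked example above can be replayed with a short sketch, assuming each numerical value $v$ is read as the degenerate interval $[v, v]$ and the meet is applied component-wise. Function names are illustrative.

```python
# Numerical context of the interval example.
DATA = {
    "g1": (5, 7, 6),
    "g2": (6, 8, 4),
    "g3": (4, 8, 5),
    "g4": (4, 9, 8),
    "g5": (5, 8, 5),
}


def delta(g):
    """Description of g: one degenerate interval [v, v] per attribute."""
    return tuple((v, v) for v in DATA[g])


def meet(d1, d2):
    """Component-wise convex hull of two interval vectors."""
    return tuple(
        (min(a1, a2), max(b1, b2)) for (a1, b1), (a2, b2) in zip(d1, d2)
    )


def common_description(objects):
    """Meet of the descriptions of a non-empty list of objects."""
    descriptions = [delta(g) for g in objects]
    result = descriptions[0]
    for d in descriptions[1:]:
        result = meet(result, d)
    return result


def sharing_objects(d):
    """Objects whose values all fall inside the intervals of d."""
    return {g for g in DATA if meet(d, delta(g)) == d}


PATTERN = common_description(["g1", "g2"])
print(PATTERN)
print(sorted(sharing_objects(PATTERN)))
```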
FCA and Pattern Structures
Interval pattern concept lattice
$(\{g_1, g_2, g_5\}, \langle [5, 6], [7, 8], [4, 6] \rangle)$ is a pattern concept
Highest concepts: largest extents and smallest intents (but the largest
intervals),
Lowest concepts: smallest extents and largest intents (but the smallest
intervals),
Problem: efficient pattern mining.
16 / 31
Interval pattern concept lattice
RDF Pattern Structures
RDF Schema
Figure: RDF Schema for p1 (class hierarchy over C1, ..., C13).
Figure: RDF Schema for p2 (class hierarchy over D1, ..., D9).
19 / 31
RDF Pattern Structures
Similarity Operation between Two Classes
Definition (Least Common Subsumer)
Given a partially ordered set $(S, \le)$, a least common subsumer $E$ of two
classes $C$ and $D$ ($\mathrm{lcs}(C, D)$ for short) is a class such that
$C \le E$ and $D \le E$, and $E$ is least, i.e., if there is a class $E'$ such
that $C \le E'$ and $D \le E'$, then $E \le E'$.
Figure: RDF Schema for p1.
20 / 31
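A least common subsumer is easy to compute when the schema is a tree. The sketch below uses a HYPOTHETICAL child-to-parent map for the p1 schema: the figure itself is not recoverable from the extracted text, so the tree was chosen only to agree with the results quoted on the slides (for example $C_4 \ge C_3 \ge C_1$). The function names are illustrative.

```python
# HYPOTHETICAL child -> parent edges for the p1 schema (assumed tree shape).
PARENT = {
    "C3": "C4", "C2": "C3", "C1": "C2", "C6": "C3", "C5": "C6",
    "C10": "C4", "C9": "C10", "C8": "C9", "C7": "C8",
    "C12": "C9", "C11": "C12", "C13": "C12",
}


def ancestors(c):
    """Return [c, parent(c), ..., root]: every class that subsumes c."""
    chain = [c]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def lcs(c, d):
    """Least common subsumer in a tree: first ancestor of c also above d."""
    above_d = set(ancestors(d))
    for e in ancestors(c):
        if e in above_d:
            return e
    raise ValueError(f"{c} and {d} have no common subsumer")


print(lcs("C1", "C5"))
print(lcs("C7", "C13"))
print(lcs("C1", "C7"))
```

In a general partial order (a DAG rather than a tree) a unique least common subsumer need not exist, so this shortcut only covers the tree case.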
RDF Pattern Structures
Similarity Operation between Two Sets of Classes
$\delta(s_1) = \langle p_1 : \{C_1, C_5, C_7\} \rangle$
$\delta(s_2) = \langle p_1 : \{C_1, C_{13}, C_{11}\} \rangle$
Figure: RDF Schema for p1.
21 / 31
$C_1$, $C_3$ and $C_4$ are comparable, i.e., $C_4 \ge C_3 \ge C_1$, so only the most
specific element, $C_1$, is kept.
22 / 31
$\delta(s_1) \sqcap \delta(s_2) = \langle p_1 : \{C_1, C_9\} \rangle$
RDF Pattern Structures
Similarity Operation between Two Sets of Classes
$\delta(s_1) = \langle (p_1 : \{C_1, C_5, C_7\}), (p_2 : \{D_6\}) \rangle$
$\delta(s_2) = \langle (p_1 : \{C_1, C_{13}, C_{11}\}), (p_2 : \{D_7\}) \rangle$
Figure: RDF Schemas for p1 and p2.
Similarity between $s_1$ and $s_2$:
$\delta(s_1) \sqcap \delta(s_2) = \langle (p_1 : \{C_1, C_9\}), (p_2 : \{D_8\}) \rangle$.
23 / 31
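The similarity between two descriptions can be sketched as follows: for each predicate, take the least common subsumer of every pair of classes, then keep only the most specific results. The class hierarchies below are HYPOTHETICAL trees, chosen to agree with the results on the slides because the schema figures are not recoverable from the extracted text. Function names are illustrative.

```python
# HYPOTHETICAL child -> parent edges for the p1 (C) and p2 (D) schemas.
PARENT = {
    "C3": "C4", "C2": "C3", "C1": "C2", "C6": "C3", "C5": "C6",
    "C10": "C4", "C9": "C10", "C8": "C9", "C7": "C8",
    "C12": "C9", "C11": "C12", "C13": "C12",
    "D4": "D5", "D3": "D4", "D2": "D3", "D1": "D2",
    "D9": "D5", "D8": "D9", "D6": "D8", "D7": "D8",
}


def ancestors(c):
    """Return [c, parent(c), ..., root]."""
    chain = [c]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def lcs(c, d):
    """Least common subsumer of two classes of the same tree."""
    above_d = set(ancestors(d))
    return next(e for e in ancestors(c) if e in above_d)


def most_specific(classes):
    """Drop every class that strictly subsumes another class of the set."""
    return {
        c for c in classes
        if not any(d != c and c in ancestors(d) for d in classes)
    }


def meet(desc1, desc2):
    """Per predicate: pairwise lcs, reduced to its most specific classes."""
    return {
        p: most_specific({lcs(c, d) for c in desc1[p] for d in desc2[p]})
        for p in desc1.keys() & desc2.keys()
    }


S1 = {"p1": {"C1", "C5", "C7"}, "p2": {"D6"}}
S2 = {"p1": {"C1", "C13", "C11"}, "p2": {"D7"}}
print(meet(S1, S2))
```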
RDF Pattern Structures
Building a Pattern Concept Lattice
S Ds
s1 {(p1:{C1, C5, C7}), (p2:{D6})}
s2 {(p1:{C1, C13, C11}), (p2:{D7})}
s3 {(p1:{C5}), (p2:{D7})}
s4 {(p1:{C13, C11, C7}), (p2:{D1})}
s5 {(p1:{C1, C7}), (p2:{D6})}
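$\{s_1, s_2\}^{\square} = \bigsqcap_{s \in \{s_1, s_2\}} \delta(s)$
$= \delta(s_1) \sqcap \delta(s_2)$
$= \langle (p_1 : \{C_1, C_9\}), (p_2 : \{D_8\}) \rangle$
$\langle (p_1 : \{C_1, C_9\}), (p_2 : \{D_8\}) \rangle^{\square} = \{s \in S \mid \langle (p_1 : \{C_1, C_9\}), (p_2 : \{D_8\}) \rangle \sqsubseteq \delta(s)\}$
$= \{s_1, s_2, s_5\}$
Pattern Concept
$(\{s_1, s_2, s_5\}, \langle (p_1 : \{C_1, C_9\}), (p_2 : \{D_8\}) \rangle)$ is a pattern concept.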
24 / 31
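The two derivation steps that produce this pattern concept can be replayed in Python. As before, the class hierarchies are HYPOTHETICAL trees consistent with the slide results, since the schema figures are not recoverable; subsumption is tested through the meet ($d \sqsubseteq \delta(s)$ iff $d \sqcap \delta(s) = d$). All function names are illustrative.

```python
from functools import reduce

# HYPOTHETICAL child -> parent edges for the p1 (C) and p2 (D) schemas.
PARENT = {
    "C3": "C4", "C2": "C3", "C1": "C2", "C6": "C3", "C5": "C6",
    "C10": "C4", "C9": "C10", "C8": "C9", "C7": "C8",
    "C12": "C9", "C11": "C12", "C13": "C12",
    "D4": "D5", "D3": "D4", "D2": "D3", "D1": "D2",
    "D9": "D5", "D8": "D9", "D6": "D8", "D7": "D8",
}
# Entity descriptions from the table S / Ds.
DELTA = {
    "s1": {"p1": {"C1", "C5", "C7"}, "p2": {"D6"}},
    "s2": {"p1": {"C1", "C13", "C11"}, "p2": {"D7"}},
    "s3": {"p1": {"C5"}, "p2": {"D7"}},
    "s4": {"p1": {"C13", "C11", "C7"}, "p2": {"D1"}},
    "s5": {"p1": {"C1", "C7"}, "p2": {"D6"}},
}


def ancestors(c):
    chain = [c]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def lcs(c, d):
    above_d = set(ancestors(d))
    return next(e for e in ancestors(c) if e in above_d)


def most_specific(classes):
    return {
        c for c in classes
        if not any(d != c and c in ancestors(d) for d in classes)
    }


def meet(desc1, desc2):
    return {
        p: most_specific({lcs(c, d) for c in desc1[p] for d in desc2[p]})
        for p in desc1.keys() & desc2.keys()
    }


def common_description(subjects):
    """Meet of the descriptions of a non-empty list of subjects."""
    return reduce(meet, (DELTA[s] for s in subjects))


def sharing_subjects(pattern):
    """Subjects whose description subsumes the pattern."""
    return {s for s, desc in DELTA.items() if meet(pattern, desc) == pattern}


INTENT = common_description(["s1", "s2"])
EXTENT = sharing_subjects(INTENT)
print(INTENT, sorted(EXTENT))
```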
RDF Pattern Structures
Pattern Concept Lattice
Figure: Pattern concept lattice (top concept K#8, bottom concept K#0).
K#ID Extent Intent
K#1 {s1} (p1:{C7, C1, C5}), (p2:{D6})
K#2 {s1, s2, s5} (p1:{C1, C9}), (p2:{D8})
K#3 {s2} (p1:{C1, C11, C13}), (p2:{D7})
K#4 {s1, s2, s3, s5} (p1:{C3}), (p2:{D8})
K#5 {s1, s3} (p1:{C5}), (p2:{D8})
K#6 {s2, s3} (p1:{C3}), (p2:{D7})
K#7 {s3} (p1:{C5}), (p2:{D7})
K#8 {s1, s2, s3, s4, s5} (p1:{C4}), (p2:{D5})
K#9 {s1, s2, s4, s5} (p1:{C9}), (p2:{D5})
K#10 {s1, s4, s5} (p1:{C7}), (p2:{D5})
K#11 {s2, s4} (p1:{C11, C13}), (p2:{D5})
K#12 {s4} (p1:{C7, C11, C13}), (p2:{D1})
K#13 {s1, s5} (p1:{C1, C7}), (p2:{D6})
Table: Details of the pattern concept lattice.
25 / 31
RDF Pattern Structures
Pattern Concept Lattice
K#ID Extent Intent
K#2 {s1, s2, s5} (p1:{C1, C9}), (p2:{D8})
K#4 {s1, s2, s3, s5} (p1:{C3}), (p2:{D8})
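$C = (p_1 : \{C_3\}), (p_2 : \{D_8\})$
$D = (p_1 : \{C_1, C_9\}), (p_2 : \{D_8\})$
$C \sqcap D = (p_1 : \{C_3\}), (p_2 : \{D_8\}) \sqcap (p_1 : \{C_1, C_9\}), (p_2 : \{D_8\})$
$C \sqcap D = (p_1 : \{\mathrm{lcs}(C_3, C_1), \mathrm{lcs}(C_3, C_9)\}), (p_2 : \{\mathrm{lcs}(D_8, D_8)\})$
$C \sqcap D = (p_1 : \{C_3, C_4\}), (p_2 : \{D_8\})$
Since $C_4 \ge C_3$, only the most specific class $C_3$ is kept:
$C \sqcap D = (p_1 : \{C_3\}), (p_2 : \{D_8\}) = C$
$C \sqcap D = C \Leftrightarrow C \sqsubseteq D$
$\forall c_1 \in C, \exists d_1 \in D, d_1 \le c_1$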
26 / 31
Navigating the Pattern Concept Lattice
Navigating Concept Lattice
Search for Cardiovascular Agents causing Allergic Conditions.
C3 = Cardiac Disorders, D8 = Cardiovascular Agents, C9 = Allergic Conditions
Figure: Pattern concept lattice.
27 / 31
Navigating the Pattern Concept Lattice
Navigating Concept Lattice
Search for Cardiovascular Agents not causing Allergic Conditions.
C3 = Cardiac Disorders, D8 = Cardiovascular Agents, C9 = Allergic Conditions,
C1 = Acute Coronary Syndrome.
Figure: Pattern concept lattice.
27 / 31
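One possible reading of this navigation, sketched in Python: find the subjects matching "Cardiovascular Agents" (D8 under p2), find those that additionally match "Allergic Conditions" (C9 under p1), and take the difference. Treating "not causing" as a set difference of two extents is an assumption, since the slides show the answer only in the lattice figure. The class hierarchies are HYPOTHETICAL trees consistent with the results on the slides, and the subsumption test follows the rule $\forall c_1 \in C, \exists d_1 \in D, d_1 \le c_1$.

```python
# HYPOTHETICAL child -> parent edges for the p1 (C) and p2 (D) schemas.
PARENT = {
    "C3": "C4", "C2": "C3", "C1": "C2", "C6": "C3", "C5": "C6",
    "C10": "C4", "C9": "C10", "C8": "C9", "C7": "C8",
    "C12": "C9", "C11": "C12", "C13": "C12",
    "D4": "D5", "D3": "D4", "D2": "D3", "D1": "D2",
    "D9": "D5", "D8": "D9", "D6": "D8", "D7": "D8",
}
DELTA = {
    "s1": {"p1": {"C1", "C5", "C7"}, "p2": {"D6"}},
    "s2": {"p1": {"C1", "C13", "C11"}, "p2": {"D7"}},
    "s3": {"p1": {"C5"}, "p2": {"D7"}},
    "s4": {"p1": {"C13", "C11", "C7"}, "p2": {"D1"}},
    "s5": {"p1": {"C1", "C7"}, "p2": {"D6"}},
}


def ancestors(c):
    """Return [c, parent(c), ..., root]."""
    chain = [c]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def subsumed(pattern, description):
    """Every class of the pattern must subsume some class of the description."""
    return all(
        any(c in ancestors(d) for d in description.get(p, ()))
        for p, classes in pattern.items()
        for c in classes
    )


def extent(pattern):
    return {s for s, desc in DELTA.items() if subsumed(pattern, desc)}


cardiovascular = extent({"p2": {"D8"}})
with_allergy = extent({"p1": {"C9"}, "p2": {"D8"}})
print(sorted(cardiovascular))                 # extent of K#4
print(sorted(with_allergy))                   # extent of K#2
print(sorted(cardiovascular - with_allergy))  # candidates without C9
```

Under these assumptions the two extents coincide with those of K#4 and K#2 in the concept table, so the same answer can be read off the lattice by moving from K#4 down to K#2 and keeping the subjects that are lost on the way.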
Experiments and Conclusion
Experimentation
Coded in C++.
Real datasets: biomedical data and DBLP.
Subsets of drug data were considered, i.e., Cardiovascular Agents and
Central Nervous System.
The selected datasets:
DrugBank (RDF triples)
SIDER (RDF triples)
MedDRA (RDF Schema - BioPortal)
MeSH (RDF Schema - Bio2RDF)
28 / 31
Experiments and Conclusion
Experimentation
Datasets                No. of Triples  No. of Subjects  No. of Objects  Runtime
Cardiovascular Agents   31098           145              927             0-22 sec
Central Nervous System  22680           105              1050            0-25 sec
Table: Statistics of two datasets and index lattice.
Figure (a): Index Size for DrugBank Dataset.
Index Size Reduction
Hiding non-interesting parts.
Removing general classes from RDF Schema.
Support.
Stability.
29 / 31
Experiments and Conclusion
Conclusion and Future Work
The obtained index provides the following benefits:
Classification of RDF triples with respect to RDF Schema.
Simultaneous access to RDF triples as well as RDF Schema.
One platform for navigating and querying several resources.
Heterogeneous data, i.e., data for which no reference schema is available
(such as proteins), can easily be dealt with.
Results are visualized using RV-Xplorer [1].
Future Work:
Provide a better formalization of the framework.
Provide more realistic use cases.
Define a new similarity measure to deal with graph-based structures.
[1] A new tool developed by our team for lattice interaction (Alam et al.,
International Conference on Concept Lattices and Their Applications, 2015).
30 / 31
  • 1. Navigating and Exploring RDF Data using Formal Concept Analysis Mehwish Alam Amedeo Napoli LORIA (CNRS – Inria Nancy Grand Est – Université de Lorraine), BP 239, Vandoeuvre les Nancy, F-54506, France, {firstname.lastname}@loria.fr September 11th 2015 Linked Data for Knowledge Discovery - ECML/PKDD - Porto, Protugal. 1 / 31
  • 2. Motivation A Scenario A patient takes some drug which is a cardiovascular agent and which causes some side effect i.e., allergic reaction. The domain expert (a medical doctor) should look for drugs which are cardiovascular agents but do not cause any side effect or allergic reactions (for this patient). The domain expert chooses some datasets, resources and knowledge sources to be searched, e.g. Drugbank (drug with their categories), SIDER (for side effect), Mesh, MedDRA. . . The objective of this research work is to provide a platform supporting the querying of such resources. Requirements: Drug Information (DrugBank) Side Effect Information (SIDER) 2 / 31
  • 4. Motivation RDF Graph for Drugbank Drug DB01193 P10635 Fatty acids acebutolol Cytochrome P450 targets rdf:type rdfs:label hasPathway rdfs:subClassOf 4 / 31
  • 5. Motivation RDF Triples as Entity and Description tid Subject Predicate Object Provenance t1 s1 p1 o11 dataset1 t2 s1 p1 o15 dataset1 t3 s1 p2 o26 dataset2 t4 s1 rdf:type Drug dataset2 t5 o11 rdf:type C1 dataset3 t6 C1 rdfs:subClassOf C3 dataset3 t7 o26 rdf:type D6 dataset4 t8 D6 rdfs:subClassOf D8 dataset4 ... ... ... ... ... Table : Sample RDF triple store from several resources. Entities S Descriptions Ds s1 {(p1:{C1, C5, C7}), (p2:{D6})} s2 {(p1:{C1, C13, C11}), (p2:{D7})} s3 {(p1:{C5}), (p2:{D7})} s4 {(p1:{C13, C11, C7}), (p2:{D1})} s5 {(p1:{C1, C7}), (p2:{D6})} Table : RDF Triples as entities S and descriptions Ds. 5 / 31
  • 6. FCA: formal context and binary attributes A formal context is a triple pG, M, Iq where G is a set of objects, M a set of attributes, and I a binary relation such as pg, mq P I means that “object g owns attribute m”. A Galois connection characterizes formal concepts: A1 “ tm P M | @g P A Ď G : pg, mq P Iu B1 “ tg P G | @m P B Ď M : pg, mq P Iu pA, Bq is a formal concept with extent A “ B1 and intent B “ A1 , e.g. ptg3, g4, g5u, tm2, m3uq. Concepts are pairs of maximal sets of objects with corre- sponding maximal sets of attributes. m1 m2 m3 g1 ˆ ˆ g2 ˆ ˆ g3 ˆ ˆ g4 ˆ ˆ g5 ˆ ˆ ˆ
  • 7. FCA: concept lattice pA1, B1q ď pA2, B2q ô A1 Ď A2 pô B2 Ď B1q ptg1, g5u, tm1, m3uq ď ptg1, g2, g5u, tm1uq The concept lattice is based on two dual partial orderings: Larger extents and smaller intents are in the “higher levels” of the concept lattice. Smaller extents and larger intents are in the “lower levels” of the concept lattice. Reduced labeling: intents are “inherited” from top to bottom (top-down), and extents are “inherited” from bottom to top (bottom-up).
  • 8. FCA and Pattern Structures Conceptual scaling The formal context is the basic data type of Formal Concept Analysis. However data are often given in form of a many-valued context. Many-valued contexts are translated to one-valued context via conceptual scaling. But this is not automatic and some arbitrary choices have to be made. Examples of scalings: Nominal: K “ pN, N, “q Ordinal: K “ pN, N, ďq Interordinal: K “ pN, N, ď Y ěq 8 / 31
  • 9. A numerical example G / M m1 m2 m3 g1 1 3 4 g2 2 2 3 g3 4 1 1 g4 3 2 1 Nominal Scaling: G / M m1=1 m1=2 m1=4 m2=1 m2=2 m2=3 m3=1 m3=3 m3=4 g1 x x x g2 x x x g3 x x x g4 x x x Interordinal Scaling: G / M m1.lt.1 m1.gt.1 m1.lt.2 m1.gt.2 m1.lt.3 m1.gt.3 m1.lt.4 m1.gt.4 m2.lt.1 m2.gt.1 g1 x x x x x x g2 x x x x x x g3 x x x x x x x g4 x x x x x x
  • 10. Computing similarity between descriptions Intersection considered as a similarity operator: X behaves like a similarity operator: tm1, m2u X tm1, m3u “ tm1u m1 m2 m3 g1 ˆ ˆ g2 ˆ ˆ g3 ˆ ˆ g4 ˆ ˆ g5 ˆ ˆ ˆ X induces a partial ordering relation Ď as follows: S1 X S2 “ S1 ðñ S1 Ď S2 tm1u X tm1, m2u “ tm1u ðñ tm1u Ď tm1, m2u X has the properties of a meet [ in a semi lattice, i.e. a commutative, associative and idempotent operation: c [ d “ c ðñ c Ď d
  • 11. The definition of a Pattern Structure A pattern structure pG, pD, [q, δq is composed of: G a set of objects, pD, [q a semi-lattice of descriptions or patterns, δ a mapping such as δpgq P D describes object g. The Galois connection for pG, pD, [q, δq is defined as: The maximal description representing the similarity of a set of objects: A˝ “ [gPAδpgq for A Ď G The maximal set of objects sharing a given description: d˝ “ tg P G|d Ď δpgqu for d P pD, [q
  • 12. Standard FCA as a Pattern Structure pG, pD, [q, δq Considering a standard formal context pG, M, Iq: G is the set of objects, pD, [q corresponds to ℘pMq where M is the set of attributes. δpgq corresponds to the description of g in terms of attributes. The Galois connection: m1 m2 m3 g1 ˆ ˆ g2 ˆ ˆ g3 ˆ ˆ g4 ˆ ˆ g5 ˆ ˆ ˆ A˝ “ [gPAδpgq for A Ď G tg1, g2u 1 “ g 1 1 X g 1 2 “ tm1, m2u X tm1, m3u “ tm1u d˝ “ tg P G|d Ď δpgqu for d P pD, [q tm1u 1 “ tgi P G|tm1u Ď g 1 i u “ tg1, g2, g5u
  • 13. From FCA to Pattern Structures A formal context pG, M, Iq is based on a set of objects G, a set of attributes M, and a binary relation I Ď G ˆ M. Two derivation operators are defined as follows, @A Ď G, B Ď M: A1 “ tm P M|@g P A, pg, mq P Iu B1 “ tg P G|@m P B, pg, mq P Iu A formal concept pA, Bq verifies A1 “ B and A “ B1. Formal concepts are partially ordered w.r.t. inclusion of extents (or dually of intents): pA1, B1q ď pA2, B2q iff A1 Ď A2 A pattern structure pG, pD, [q, δq is based on a set of objects G, a meet semi-lattice of object descriptions pD, [q, and a mapping δ : G ÝÑ D which associates a description to each object. Two derivation operators are defined as follows, @A Ď G, d P pD, [q: A˝ “ [gPAδpgq d˝ “ tg P G|d Ď δpgq A formal concept pA, dq verifies A˝ “ d and A “ d˝ Pattern concepts are partially ordered w.r.t. inclusion of extents (or dually inclusion of intents): pA1, d1q ď pA2, d2q iff A1 Ď A2
  • 14. Interval Pattern Structure Let D be a set of intervals with integer bounds (for simplicity), let [ be a meet operator defined on D as the convex hull of intervals: ra1, b1s [ ra2, b2s “ rminpa1, a2q, maxpb1, b2qs r4, 5s [ r5, 5s “ r4, 5s ra1, b1s Ď ra2, b2s ðñ ra2, b2s Ď ra1, b1s r4, 5s Ď r5, 5s ðñ r5, 5s Ď r4, 5s
  • 15. Interval Pattern Structure m1 m2 m3 g1 5 7 6 g2 6 8 4 g3 4 8 5 g4 4 9 8 g5 5 8 5 tg1, g2u˝ “ [gPtg1,g2uδpgq “ x5, 7, 6y [ x6, 8, 4y “ xr5, 6s, r7, 8s, r4, 6sy xr5, 6s, r7, 8s, r4, 6sy˝ “ tg P G|xr5, 6s, r7, 8s, r4, 6sy Ď δpgqu “ tg1, g2, g5u ptg1, g2, g5u, xr5, 6s, r7, 8s, r4, 6syq is a pattern concept
  • 16. FCA and Pattern Structures Interval pattern concept lattice ptg1, g2, g5u, xr5, 6s, r7, 8s, r4, 6syq is a pattern concept Highest concepts: largest extents and smallest intents (but the largest intervals), Lowest concepts: smallest extents and largest intents (but the smallest intervals), Problem: efficient pattern mining. 16 / 31
  • 18. RDF Pattern Structures Back to RDF Triple Descriptions tid Subject Predicate Object Provenance t1 s1 p1 o11 dataset1 t2 s1 p1 o15 dataset1 t3 s1 p2 o26 dataset2 t4 s1 rdf:type Drug dataset2 t5 o11 rdf:type C1 dataset3 t6 C1 rdfs:subClassOf C3 dataset3 t7 o26 rdf:type D6 dataset4 t8 D6 rdfs:subClassOf D8 dataset4 ... ... ... ... ... Table : Sample RDF triple store from several resources. Entities S Descriptions Ds s1 {(p1:{C1, C5, C7}), (p2:{D6})} s2 {(p1:{C1, C13, C11}), (p2:{D7})} s3 {(p1:{C5}), (p2:{D7})} s4 {(p1:{C13, C11, C7}), (p2:{D1})} s5 {(p1:{C1, C7}), (p2:{D6})} Table : RDF Triples as entities S and descriptions Ds. 18 / 31
  • 19. RDF Pattern Structures RDF Schema C4 C3 C2 C1 C6 C5 C10 C9 C8 C7 C12 C11 C13 Figure : RDF Schema for p1. D5 D4 D3 D2 D1 D9 D8 D6 D7 Figure : RDF Schema for p2. 19 / 31
  • 20. RDF Pattern Structures: Similarity Operation between Two Classes
    Definition (Least Common Subsumer). Given a partially ordered set $(S, \le)$, a least common subsumer $E$ of two classes $C$ and $D$ ($lcs(C, D)$ for short) is a class such that $C \le E$ and $D \le E$ and $E$ is least, i.e., if there is a class $E'$ such that $C \le E'$ and $D \le E'$, then $E \le E'$.
    Figure: RDF Schema for p1 (class hierarchy over C1 to C13).
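A minimal Python sketch of the least common subsumer follows. It assumes the class hierarchy is a tree given as a child-to-parent map; the edges listed in `PARENT` are an assumption (the figure's edges are not recoverable from the transcript), chosen only so that they agree with results stated on other slides, such as $C_4 \ge C_3 \ge C_1$.

```python
# Assumed fragment of the p1 hierarchy (child -> parent); hypothetical edges.
PARENT = {
    "C1": "C3", "C5": "C3", "C3": "C4",
    "C7": "C9", "C11": "C9", "C13": "C9", "C9": "C4",
}

def ancestors(c):
    """c followed by its successive parents, up to the root."""
    chain = [c]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def lcs(c, d):
    """Least common subsumer of c and d in a tree-shaped hierarchy."""
    above_d = set(ancestors(d))
    for a in ancestors(c):      # walk upwards from c, most specific first
        if a in above_d:
            return a
    raise ValueError(f"{c} and {d} have no common subsumer")

print(lcs("C1", "C5"))   # C3
print(lcs("C7", "C13"))  # C9
print(lcs("C1", "C7"))   # C4
```

In a tree the first shared ancestor found while walking upwards is unique, so `lcs` is well defined; a general partial order may need a different strategy.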
  • 22. RDF Pattern Structures: Similarity Operation between Two Sets of Classes
    $\delta(s_1) = \langle (p_1 : \{C_1, C_5, C_7\}) \rangle$
    $\delta(s_2) = \langle (p_1 : \{C_1, C_{13}, C_{11}\}) \rangle$
    Figure: RDF Schema for p1 (class hierarchy over C1 to C13).
  • 23. RDF Pattern Structures: Similarity Operation between Two Sets of Classes
    $\delta(s_1) = \langle (p_1 : \{C_1, C_5, C_7\}) \rangle$
    $\delta(s_2) = \langle (p_1 : \{C_1, C_{13}, C_{11}\}) \rangle$
    Figure: RDF Schema for p1 (class hierarchy over C1 to C13).
    $C_1$, $C_3$ and $C_4$ are comparable, i.e., $C_4 \ge C_3 \ge C_1$, so only the most specific element, $C_1$, is kept.
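  • 24. RDF Pattern Structures: Similarity Operation between Two Sets of Classes
    $\delta(s_1) = \langle (p_1 : \{C_1, C_5, C_7\}) \rangle$
    $\delta(s_2) = \langle (p_1 : \{C_1, C_{13}, C_{11}\}) \rangle$
    Figure: RDF Schema for p1 (class hierarchy over C1 to C13).
    $\delta(s_1) \sqcap \delta(s_2) = \langle (p_1 : \{C_1, C_9\}) \rangle$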
  • 25. RDF Pattern Structures: Similarity Operation between Two Sets of Classes
    $\delta(s_1) = \langle (p_1 : \{C_1, C_5, C_7\}), (p_2 : \{D_6\}) \rangle$
    $\delta(s_2) = \langle (p_1 : \{C_1, C_{13}, C_{11}\}), (p_2 : \{D_7\}) \rangle$
    Figures: RDF Schema for p1 (classes C1 to C13) and RDF Schema for p2 (classes D1 to D9).
    Similarity between $s_1$ and $s_2$: $\delta(s_1) \sqcap \delta(s_2) = \langle (p_1 : \{C_1, C_9\}), (p_2 : \{D_8\}) \rangle$.
  • 26. RDF Pattern Structures: Building a Pattern Concept Lattice
    S    Ds
    s1   {(p1:{C1, C5, C7}), (p2:{D6})}
    s2   {(p1:{C1, C13, C11}), (p2:{D7})}
    s3   {(p1:{C5}), (p2:{D7})}
    s4   {(p1:{C13, C11, C7}), (p2:{D1})}
    s5   {(p1:{C1, C7}), (p2:{D6})}
    $\{s_1, s_2\}^{\square} = \sqcap_{s \in \{s_1, s_2\}}\, \delta(s) = \delta(s_1) \sqcap \delta(s_2) = \langle (p_1 : \{C_1, C_9\}), (p_2 : \{D_8\}) \rangle$
    $\langle (p_1 : \{C_1, C_9\}), (p_2 : \{D_8\}) \rangle^{\square} = \{s \in S \mid \langle (p_1 : \{C_1, C_9\}), (p_2 : \{D_8\}) \rangle \sqsubseteq \delta(s)\} = \{s_1, s_2, s_5\}$
    $(\{s_1, s_2, s_5\}, \langle (p_1 : \{C_1, C_9\}), (p_2 : \{D_8\}) \rangle)$ is a pattern concept.
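The two derivations on this slide can be sketched in Python as follows. The entity descriptions are those of the table. The `PARENT` edges are assumptions (the schema figures are not recoverable from the transcript), picked so that they agree with the similarity results stated on the slides. All function names are illustrative. The last statement enumerates extents by brute force over all entity subsets, which is only practical for a toy example.

```python
from functools import reduce
from itertools import combinations

# Assumed child -> parent edges (tree-shaped hierarchies); hypothetical.
PARENT = {
    "C1": "C3", "C5": "C3", "C3": "C4",
    "C7": "C9", "C11": "C9", "C13": "C9", "C9": "C4",
    "D6": "D8", "D7": "D8", "D8": "D5", "D1": "D5",
}

# Entity descriptions from the slide's table.
DELTA = {
    "s1": {"p1": {"C1", "C5", "C7"}, "p2": {"D6"}},
    "s2": {"p1": {"C1", "C13", "C11"}, "p2": {"D7"}},
    "s3": {"p1": {"C5"}, "p2": {"D7"}},
    "s4": {"p1": {"C13", "C11", "C7"}, "p2": {"D1"}},
    "s5": {"p1": {"C1", "C7"}, "p2": {"D6"}},
}

def ancestors(c):
    """c followed by its successive parents, up to the root."""
    chain = [c]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def leq(c, d):
    """True when class c is equal to or more specific than class d."""
    return d in ancestors(c)

def lcs(c, d):
    """Least common subsumer: first ancestor of c that also subsumes d."""
    return next(a for a in ancestors(c) if leq(d, a))

def most_specific(classes):
    """Drop every class that strictly subsumes another class of the set."""
    return frozenset(c for c in classes
                     if not any(o != c and leq(o, c) for o in classes))

def meet(d1, d2):
    """Similarity: per predicate, pairwise lcs reduced to most specific classes."""
    return {p: most_specific({lcs(x, y) for x in d1[p] for y in d2[p]})
            for p in d1}

def subsumed(d, e):
    """d is subsumed by e: each class of d subsumes some class of e."""
    return all(any(leq(y, x) for y in e[p]) for p in d for x in d[p])

def describe(entities):
    """Common description of a non-empty collection of entities."""
    return reduce(meet, (DELTA[s] for s in entities))

def extent(d):
    """Entities whose description is subsumed by d."""
    return frozenset(s for s in DELTA if subsumed(d, DELTA[s]))

d12 = describe(["s1", "s2"])   # p1: {C1, C9}, p2: {D8}
e12 = extent(d12)              # {s1, s2, s5}

# Brute force: close every non-empty subset of entities.
extents = {extent(describe(A))
           for r in range(1, len(DELTA) + 1)
           for A in combinations(sorted(DELTA), r)}
print(sorted(e12), len(extents))
```

Under these assumed edges the enumeration yields 13 distinct non-empty extents, the same number as the concepts K#1 to K#13 listed in the table "Details of Pattern Concept lattice".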
  • 27. RDF Pattern Structures: Pattern Concept Lattice
    Figure: Pattern Concept lattice (nodes K#0 to K#13).
    K#ID   Extent                  Intent
    K#1    {s1}                    (p1:{C7, C1, C5}), (p2:{D6})
    K#2    {s1, s2, s5}            (p1:{C1, C9}), (p2:{D8})
    K#3    {s2}                    (p1:{C1, C11, C13}), (p2:{D7})
    K#4    {s1, s2, s3, s5}        (p1:{C3}), (p2:{D8})
    K#5    {s1, s3}                (p1:{C5}), (p2:{D8})
    K#6    {s2, s3}                (p1:{C3}), (p2:{D7})
    K#7    {s3}                    (p1:{C5}), (p2:{D7})
    K#8    {s1, s2, s3, s4, s5}    (p1:{C4}), (p2:{D5})
    K#9    {s1, s2, s4, s5}        (p1:{C9}), (p2:{D5})
    K#10   {s1, s4, s5}            (p1:{C7}), (p2:{D5})
    K#11   {s2, s4}                (p1:{C11, C13}), (p2:{D5})
    K#12   {s4}                    (p1:{C7, C11, C13}), (p2:{D1})
    K#13   {s1, s5}                (p1:{C1, C7}), (p2:{D6})
    Table: Details of Pattern Concept lattice
  • 28. RDF Pattern Structures: Pattern Concept Lattice
    K#ID   Extent              Intent
    K#2    {s1, s2, s5}        (p1:{C1, C9}), (p2:{D8})
    K#4    {s1, s2, s3, s5}    (p1:{C3}), (p2:{D8})
    $C = (p_1 : \{C_3\}), (p_2 : \{D_8\})$
    $D = (p_1 : \{C_1, C_9\}), (p_2 : \{D_8\})$
    $C \sqcap D = (p_1 : \{C_3\}), (p_2 : \{D_8\}) \sqcap (p_1 : \{C_1, C_9\}), (p_2 : \{D_8\})$
    $C \sqcap D = (p_1 : \{lcs(C_3, C_1), lcs(C_3, C_9)\}), (p_2 : \{lcs(D_8, D_8)\})$
    $C \sqcap D = (p_1 : \{C_3, C_4\}), (p_2 : \{D_8\})$
    $C \sqcap D = (p_1 : \{C_3\}), (p_2 : \{D_8\}) = C$ (since $C_4 \ge C_3$, only the most specific class $C_3$ is kept)
    $C \sqcap D = C \iff C \sqsubseteq D$, i.e., $\forall c_1 \in C,\ \exists d_1 \in D,\ d_1 \le c_1$
  • 29. Navigating the Pattern Concept Lattice: Navigating Concept Lattice
    Search for Cardiovascular Agents causing Allergic Conditions.
    C3 = Cardiac Disorders, D8 = Cardiovascular Agents, C9 = Allergic Conditions.
    Figure: Pattern Concept lattice (nodes K#0 to K#13); the concepts are those listed in the table "Details of Pattern Concept lattice".
  • 30. Navigating the Pattern Concept Lattice: Navigating Concept Lattice
    Search for Cardiovascular Agents causing Allergic Conditions.
    C3 = Cardiac Disorders, D8 = Cardiovascular Agents, C9 = Allergic Conditions, C1 = Acute Coronary Syndrome.
    Figure: Pattern Concept lattice (nodes K#0 to K#13); the concepts are those listed in the table "Details of Pattern Concept lattice".
  • 31. Navigating the Pattern Concept Lattice: Navigating Concept Lattice
    Search for Cardiovascular Agents not causing Allergic Conditions.
    C3 = Cardiac Disorders, D8 = Cardiovascular Agents, C9 = Allergic Conditions, C1 = Acute Coronary Syndrome.
    Figure: Pattern Concept lattice (nodes K#0 to K#13); the concepts are those listed in the table "Details of Pattern Concept lattice".
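One possible reading of the two searches, sketched in Python over the concept table "Details of Pattern Concept lattice": pick the most general concept whose intent lists the requested class, then combine extents. The selection strategy and the helper names (`matching`, `most_general`) are assumptions made for illustration; the slides do not spell out the navigation procedure.

```python
# Concept table from the slides: id -> (extent, p1 classes, p2 classes).
CONCEPTS = {
    "K#1": ({"s1"}, {"C7", "C1", "C5"}, {"D6"}),
    "K#2": ({"s1", "s2", "s5"}, {"C1", "C9"}, {"D8"}),
    "K#3": ({"s2"}, {"C1", "C11", "C13"}, {"D7"}),
    "K#4": ({"s1", "s2", "s3", "s5"}, {"C3"}, {"D8"}),
    "K#5": ({"s1", "s3"}, {"C5"}, {"D8"}),
    "K#6": ({"s2", "s3"}, {"C3"}, {"D7"}),
    "K#7": ({"s3"}, {"C5"}, {"D7"}),
    "K#8": ({"s1", "s2", "s3", "s4", "s5"}, {"C4"}, {"D5"}),
    "K#9": ({"s1", "s2", "s4", "s5"}, {"C9"}, {"D5"}),
    "K#10": ({"s1", "s4", "s5"}, {"C7"}, {"D5"}),
    "K#11": ({"s2", "s4"}, {"C11", "C13"}, {"D5"}),
    "K#12": ({"s4"}, {"C7", "C11", "C13"}, {"D1"}),
    "K#13": ({"s1", "s5"}, {"C1", "C7"}, {"D6"}),
}

def matching(p1_class=None, p2_class=None):
    """Ids of concepts whose intent explicitly lists the requested classes."""
    return [k for k, (_, p1, p2) in CONCEPTS.items()
            if (p1_class is None or p1_class in p1)
            and (p2_class is None or p2_class in p2)]

def most_general(ids):
    """Among the given concepts, the one with the largest extent."""
    return max(ids, key=lambda k: len(CONCEPTS[k][0]))

# D8 = Cardiovascular Agents, C9 = Allergic Conditions.
agents = CONCEPTS[most_general(matching(p2_class="D8"))][0]     # extent of K#4
allergic = CONCEPTS[most_general(matching(p1_class="C9"))][0]   # extent of K#9
causing = CONCEPTS[most_general(matching("C9", "D8"))][0]       # extent of K#2
not_causing = agents - allergic

print(sorted(causing))      # ['s1', 's2', 's5']
print(sorted(not_causing))  # ['s3']
```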
  • 34. Experiments and Conclusion: Experimentation
    Implemented in C++.
    Real datasets: biomedical data and DBLP.
    Subsets of drug data were considered, i.e., Cardiovascular Agents and Central Nervous System.
    The selected datasets:
      Drugbank (RDF triples)
      SIDER (RDF triples)
      MedDRA (RDF Schema - BioPortal)
      MeSH (RDF Schema - Bio2RDF)
  • 35. Experiments and Conclusion: Experimentation
    Datasets                  No. of Triples   No. of Subjects   No. of Objects   Runtime
    Cardiovascular Agents     31098            145               927              0-22 sec
    Central Nervous System    22680            105               1050             0-25 sec
    Table: Statistics of two datasets and index lattice.
    Figure (a): Index Size for Drugbank Dataset.
    Index Size Reduction:
      Hiding non-interesting parts.
      Removing general classes from RDF Schema.
      Support.
      Stability.
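As a hedged illustration of support-based reduction, the Python sketch below keeps only the concepts of the toy lattice (table "Details of Pattern Concept lattice") whose extent reaches a minimum size. Taking support as the extent size and using a threshold of 3 are assumptions for illustration; the slides do not give the exact definition or the thresholds used in the experiments.

```python
# Extents from the table "Details of Pattern Concept lattice".
EXTENTS = {
    "K#1": {"s1"},
    "K#2": {"s1", "s2", "s5"},
    "K#3": {"s2"},
    "K#4": {"s1", "s2", "s3", "s5"},
    "K#5": {"s1", "s3"},
    "K#6": {"s2", "s3"},
    "K#7": {"s3"},
    "K#8": {"s1", "s2", "s3", "s4", "s5"},
    "K#9": {"s1", "s2", "s4", "s5"},
    "K#10": {"s1", "s4", "s5"},
    "K#11": {"s2", "s4"},
    "K#12": {"s4"},
    "K#13": {"s1", "s5"},
}

MIN_SUPPORT = 3  # hypothetical threshold: minimum number of entities

def support(concept_id):
    """Absolute support: number of entities in the concept's extent."""
    return len(EXTENTS[concept_id])

# Keep only concepts that cover at least MIN_SUPPORT entities.
frequent = [k for k in EXTENTS if support(k) >= MIN_SUPPORT]
print(frequent)  # ['K#2', 'K#4', 'K#8', 'K#9', 'K#10']
```

With this threshold the index shrinks from 13 to 5 concepts, at the price of dropping the most specific ones.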
  • 36. Experiments and Conclusion: Conclusion and Future Work
    The obtained index provides the following benefits:
      Classification of RDF triples with respect to RDF Schema.
      Simultaneous access to RDF triples as well as RDF Schema.
      One platform for navigating and querying several resources.
      Heterogeneous data, i.e., data for which no reference schema is present (such as proteins), can be easily dealt with.
    Results are visualized using RV-Xplorer [1].
    Future Work:
      Provide a better formalization of the framework.
      Provide more realistic use cases.
      Define a new similarity measure to deal with graph-based structures.
    [1] A new tool developed by our team for allowing lattice interaction (Alam et al., International Conference on Concept Lattices and Their Applications, 2015).