Linked Open Data
SIKS course on Data Science

May 20, 2016 Vught.

Laura Hollink
Why do we create and use Linked Open Data?
Example questions from
the humanities and
social sciences
How did the debate about
the financial crisis in
Greece develop?
Searching the proceedings of the EU Parliament
"Greece" in the plenary meetings of the European Parliament
[Figure: number of mentions per year, 1999-2013]
Searching through newspaper archives
Mentions of “Griekenland” in the Dutch newspaper the Telegraaf.
Search volumes on a search engine
Query = “Greece”
http://www.google.com/trends
We need access to data. Analysing
them gives us some useful insight.
But to answer the question properly
we would need to combine sources
and do more complex queries.
Why do we create and use Linked Open Data?
Example question 2 

Which political debate in the
post-war period has attracted
most media attention?
“De Indonesische Quaestie"
To answer this question we need to
go through all newspaper articles
about all political debates.
-> we need access to combined
data sources, we need
structured queries.
Why do we create and use Linked Open Data?
Example question 3
What are the differences
between different media?

Example question 4
Has the coverage changed
over time?
Research goals and research questions
Our goal is to build an infrastructure to answer these kinds of questions.

1. How do we automatically link heterogeneous datasets?

2. How do we interpret links between datasets of different quality and certainty?

3. What can we conclude from usage statistics on these datasets?

4. Can we design interfaces that allow scholars to study the datasets

• including the links between them?

• while assessing the reliability of the findings?
Data Science - Big Data - Linked Open Data
Table of Contents
1. What is Linked Open Data (LOD)
2. Creating LOD
1. How to discover links
2. How to represent links on the Web
3. How to evaluate links
3. Access to LOD (from both the server and the client
perspective)
What is Linked Open Data?
A method of publishing structured data on the Web
in such a way that it can be linked and queried
by computers as well as humans.
The Web of Documents
• Documents identified by URIs (html, pdf, images, movies, etc.)
  • with structured information for humans (tables, headers) and
  • with hyperlinks between them
• The data is not machine readable, meant for humans
  • structure is implicit (what do the columns of a table mean?)
  • links are not typed (what is the relation between two documents?)
The Web of Data
• Everything identified by URIs (not just documents, but also classes, instances, relations/links)
• The data is machine readable:
  • in formal languages (RDF, RDFS, OWL, SKOS)
  • which enable machines to do reasoning, i.e. infer new statements from inserted statements.
Compared to a database table…
As a table:
Thing       Type   Population   Airport
Amsterdam   City   1364422      Schiphol
…           …      …            …
As a graph:
Amsterdam   is a             City
Amsterdam   has population   “1364422”
Amsterdam   has airport      Schiphol
Differences:
• Statements can be distributed over the web
• Non-unique naming assumption
• Open World assumption
• Everyone can say anything about anything
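The correspondence between a table row and a set of statements can be sketched in a few lines of Python. This is only an illustration: the predicate names (`is_a`, `has_population`, `has_airport`) and the helper `row_to_triples` are assumptions made for this sketch, not part of any vocabulary used in the course.

```python
# Column names of the table, mapped to hypothetical predicate names.
COLUMN_TO_PREDICATE = {
    "Type": "is_a",
    "Population": "has_population",
    "Airport": "has_airport",
}

def row_to_triples(row):
    """Turn one table row into (subject, predicate, object) triples."""
    subject = row["Thing"]
    return [
        (subject, COLUMN_TO_PREDICATE[column], value)
        for column, value in row.items()
        if column != "Thing"
    ]

row = {"Thing": "Amsterdam", "Type": "City",
       "Population": "1364422", "Airport": "Schiphol"}
triples = row_to_triples(row)
for triple in triples:
    print(triple)
```

Each cell becomes one independent statement, which is what makes it possible to distribute statements about the same thing over different sources.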
Examples of URIs on the Web of Data
• documents:
  • http://vu.nl/index.html
  • http://example.org/cities#Leuven
• real world objects (a book in the library, a person)
  • isbn://5031-4444-333
  • http://eyaloren.org/foaf.rdf#me
• concepts:
  • http://cyc.org/concept/Mammal
  • http://cyc.org/concept/Dog
  • www.w3.org/2006/03/wn/wn20/instances/synset-anniversary-noun-1
• relations:
  • http://purl.org/linkedpolitics/vocabulary/speaker
RDF (the basics)
• A W3C recommendation to describe resources on the Web of Data called “Resource Description Framework”
• See https://www.w3.org/RDF/
• RDF data model: triples!
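RDF example in Turtle syntax:
@prefix foaf:   <http://xmlns.com/foaf/0.1/> .
@prefix schema: <http://schema.org/> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix wd:     <http://www.wikidata.org/entity/> .
<bob#me>
    a foaf:Person ;
    foaf:knows <alice#me> ;
    schema:birthDate "1990-07-04"^^xsd:date ;
    foaf:topic_interest wd:Q12418 .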
Vocabulary definition and reasoning with RDFS
[Diagram: classes A, B and C at the ontology / vocabulary / schema level; resource r at the data level]
IF
B rdfs:subClassOf A
C rdfs:subClassOf B
THEN
C rdfs:subClassOf A
IF
B rdfs:subClassOf A
r rdf:type B
THEN
r rdf:type A
Example:
<bob#me> rdf:type foaf:Person .
foaf:Person rdfs:subClassOf foaf:Agent .
Inferred:
<bob#me> a foaf:Agent .
Standard meaning
Vocabulary definition and reasoning with RDFS
IF
p rdfs:range R
A p B
THEN
B rdf:type R
Example:
<bob#me> foaf:knows <alice#me> .
foaf:knows rdfs:range foaf:Person .
Inferred:
<alice#me> rdf:type foaf:Person .
Standard meaning
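The IF/THEN rules on these slides can be applied mechanically. The following is a minimal Python sketch of such a reasoner, assuming triples are plain tuples of strings; the function name `rdfs_closure` is hypothetical, and only the three rules shown on the slides are covered (RDFS defines more).

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"
RANGE = "rdfs:range"

def rdfs_closure(triples):
    """Apply the three rules repeatedly until no new triple can be derived."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # C subClassOf B, B subClassOf A  =>  C subClassOf A
                if p == SUBCLASS and p2 == SUBCLASS and o == s2:
                    new.add((s, SUBCLASS, o2))
                # r type B, B subClassOf A  =>  r type A
                if p == TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, TYPE, o2))
                # A p B, p range R  =>  B type R
                if p2 == RANGE and p == s2:
                    new.add((o, TYPE, o2))
        if new <= inferred:
            return inferred
        inferred |= new

data = {
    ("<bob#me>", TYPE, "foaf:Person"),
    ("foaf:Person", SUBCLASS, "foaf:Agent"),
    ("<bob#me>", "foaf:knows", "<alice#me>"),
    ("foaf:knows", RANGE, "foaf:Person"),
}
closure = rdfs_closure(data)
for triple in sorted(closure - data):
    print(triple)
```

Note that the rules chain: once `<alice#me>` is inferred to be a `foaf:Person` through the range rule, the subclass rule also makes her a `foaf:Agent`.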
SPARQL (the basics)
• A W3C recommendation for querying RDF graphs called “SPARQL Protocol And RDF Query Language”
• See http://www.w3.org/TR/rdf-sparql-query/ or http://www.w3.org/TR/sparql11-query/
Data: :JamesDean :playedIn :Giant .
Query: :JamesDean :playedIn ?what .
Answer: :Giant
Query: ?who :playedIn :Giant .
Answer: :JamesDean
Query: :JamesDean ?what :Giant .
Answer: :playedIn
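The queries above are triple patterns; in a complete SPARQL query they appear inside `SELECT ... WHERE { ... }`. What matching a single pattern means can be sketched in Python as follows. This is a toy illustration under the assumption that triples are tuples of strings; `match` is a hypothetical helper, not a SPARQL engine.

```python
def match(pattern, triples):
    """Return one variable binding per triple that fits the pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            # No mismatch found: the triple fits the pattern.
            results.append(binding)
    return results

data = [(":JamesDean", ":playedIn", ":Giant")]
print(match((":JamesDean", ":playedIn", "?what"), data))
print(match(("?who", ":playedIn", ":Giant"), data))
print(match((":JamesDean", "?what", ":Giant"), data))
```

A variable can stand in any of the three positions, which is why the same single statement answers all three questions.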
Linked Open Data
A method of publishing on the Web of Data: openly
available, in RDF, with links to other datasets.
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
Creating Linked Open Data
in the Talk of Europe project:
Discovering links, knowledge representation
The European Parliament as Linked Open Data
Laura Hollink, Centrum Wiskunde & Informatica, Amsterdam
Astrid van Aggelen, VU University Amsterdam
Martijn Kleppe, Erasmus University Rotterdam
Henri Beunders, Erasmus University Rotterdam
Jill Briggeman, Erasmus University Rotterdam
Max Kemman, University of Luxembourg
Talk of Europe goals
• To publish the entire plenary debates of the European
Parliament as Linked Open Data

• To improve access to the data

• To enable large scale analysis across time spans.

‣ For residents of the European Union, access to the proceedings of the European Parliament is a formal right.
A. van Aggelen, L. Hollink, M.
Kemman, M. Kleppe & H. Beunders.
The debates of the European
Parliament as Linked Open Data.
Semantic Web Journal. In press, 2016.
1. Data in RDF
14M RDF statements about the 30K
speeches in 23 languages by 3K
speakers in 1K session days that
were held in the EU parliament
between 1999 and 2014
2. Links to external datasets
Example 1: speeches that contain a certain keyword
Query: all speeches that contain the phrase “open data”
…. So let us go for open data, let us
go for utilisation of all the instruments
available to that end! …..
…. but there too governments are
encouraging the use of open data to
increase transparency, accountability
and citizen participation ….
…. We already have many open data
projects in the Member States and
local authorities…..
Example 2: speeches that contain a certain
keyword by date
"Slovenia" in the plenary meetings of the European Parliament
[Figure: number of mentions per year, 1999-2013]
Example 2: speeches that contain a certain keyword
by date
Mentions of 'human rights'
[Figure: frequency of mentions by date, 1999-2013]
Example 3: speeches that contain a certain keyword
by country
Mentions of 'human rights' by country
[Figure: number of mentions per country, for AT BE BG CY CZ DE DK EE ES FI FR GB GR HR HU IE IT LT LU LV MT NL PL PT RO SE SI SK]
Example 4: the number of speeches per EU
country
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?c (COUNT(?c) AS ?count)
WHERE {
  ?x rdf:type <http://purl.org/linkedpolitics/vocabulary/eu/plenary/Speech> .
  ?x <http://purl.org/linkedpolitics/vocabulary#speaker> ?p .
  ?p <http://purl.org/linkedpolitics/vocabulary#countryOfRepresentation> ?c
} GROUP BY ?c LIMIT 50
Example 5: background info about the MEPs
• MEPs that were not born in Europe.
Members of Parliament
Integrate data from
the EU parliament
with external datasets
Linking Members of Parliament to Wikipedia / DBpedia
• String matching is the most important feature in the linking process.
• “nearly all [alignment systems] use a string similarity metric” [12]
• Stop-word removal and stemming are not helpful, and neither is using WordNet synonyms. [12]
[12] Cheatham, M., & Hitzler, P. String similarity metrics for ontology alignment. ISWC 2013.
http://www.dbpedia.org/resource/Judith_Sargentini
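As an illustration of string matching for linking, the sketch below compares a name with a list of candidate labels using Python's standard `difflib`. It is not the method used in Talk of Europe: the similarity measure, the token sorting, the threshold of 0.9, the candidate labels and the function names are all assumptions made for this example.

```python
from difflib import SequenceMatcher

def normalise(name):
    """Lower-case the name and sort its tokens, so word order does not matter."""
    return " ".join(sorted(name.lower().split()))

def similarity(a, b):
    """Similarity ratio in [0, 1] between two normalised names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()

def best_match(name, candidates, threshold=0.9):
    """Return the most similar candidate label, or None if none is similar enough."""
    if not candidates:
        return None
    score, label = max((similarity(name, c), c) for c in candidates)
    return label if score >= threshold else None

# Hypothetical candidate labels, e.g. derived from DBpedia resource names.
candidates = ["Judith Sargentini", "Judith Merkies", "Martin Schulz"]
print(best_match("SARGENTINI Judith", candidates))  # prints "Judith Sargentini"
print(best_match("Unknown Person", candidates))     # prints "None"
```

A threshold trades precision against recall: raising it yields fewer but more reliable links.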
How to relate a speech to a speaker and party?
[Diagram, written as triples]
lp:eu/plenary/2009-10-21/Speech_140 lpv:speaker lp:EUmember_1023 .
lp:EUmember_1023 lpv:hasParty lp:EUParty/SomeParty .
Why is this not a good solution?
1. A person might be a member of more than one party (at different times).
2. Since there is no link between a speech and a party, queries for all speeches given by the members of a certain party become very complicated.
How to relate a speech to a speaker and party?
[Diagram, written as triples]
lp:eu/plenary/2009-10-21/Speech_140
    lpv:speaker lp:EUmember_1023 ;
    lpv:spokenAs lp:political-Function101 , lp:political-Function102 .
lp:EUmember_1023
    lpv:politicalFunction lp:political-Function101 , lp:political-Function102 .
lp:political-Function101
    rdf:type lpv:PoliticalFunction ;
    lpv:institution lp:EUParty/NI ;
    lpv:role lp:Role/member ;
    lpv:beginning "20071114"^^xsd:date ;
    lpv:end "20111126"^^xsd:date .
lp:political-Function102
    rdf:type lpv:PoliticalFunction ;
    lpv:institution lp:EUCommittee/Committee_on_Legal_Affairs ;
    lpv:role lp:Role/substitute ;
    lpv:beginning "20090716"^^xsd:date ;
    lpv:end "20111126"^^xsd:date .
Note: this is a common “design pattern” referred to as n-ary relations or relations as classes.
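The benefit of modelling a political function as a resource of its own is that it can carry a beginning and an end date, so the functions held at the moment of a speech can be looked up. The Python sketch below illustrates this under stated assumptions: the records are plain dictionaries, the dates are those of the slide diagram as read here, and `functions_at` is a hypothetical helper.

```python
from datetime import date

# One record per political function (the intermediate node of the n-ary relation).
functions = [
    {"id": "lp:political-Function101", "member": "lp:EUmember_1023",
     "institution": "lp:EUParty/NI", "role": "lp:Role/member",
     "beginning": date(2007, 11, 14), "end": date(2011, 11, 26)},
    {"id": "lp:political-Function102", "member": "lp:EUmember_1023",
     "institution": "lp:EUCommittee/Committee_on_Legal_Affairs",
     "role": "lp:Role/substitute",
     "beginning": date(2009, 7, 16), "end": date(2011, 11, 26)},
]

def functions_at(member, day):
    """Return the functions a member held on a given day."""
    return [f for f in functions
            if f["member"] == member and f["beginning"] <= day <= f["end"]]

speech = {"id": "lp:eu/plenary/2009-10-21/Speech_140",
          "speaker": "lp:EUmember_1023", "date": date(2009, 10, 21)}

# The functions the speaker held when giving the speech.
spoken_as = [f["id"] for f in functions_at(speech["speaker"], speech["date"])]
print(spoken_as)
```

Storing the result as direct links from the speech to the functions keeps queries such as "all speeches by members of a certain party" simple.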
Intermezzo: one-question Quiz
Reasoning on the Web of Data
Question: What can we conclude from this graph?

A. Stihler is a member of exactly 3 parties

B. Stihler is a member of at least 3 parties

C. Stihler is a member of at most 3 parties

D. None of the above

E. All of the above

F. Other, namely ….
[Graph, written as triples]
<http://purl.org/linkedpolitics/EUmember_4545>
    foaf:name "Catherine Stihler" ;
    :memberOf <http://purl.org/linkedpolitics/EUParty/PES> ,
              <http://dbpedia.org/resource/Party_of_European_Socialists> ,
              <http://dbpedia.org/resource/Progressive_Alliance_of_Socialists_and_Democrats> .
Creating Linked Open Data
in the PoliMedia project:
Discovering links, knowledge representation, evaluation
Linking government data
to news data
Linked Open Data
Which political debate in
the post-war period has
attracted most media
attention?

What are the differences
between different media?

Has the coverage changed
over time?
Transcriptions of all 9,294
meetings of the Dutch
parliament between
1945-1995, consisting of
1,208,903 speeches.
Roughly 1.8 Million news
bulletins between
1937-1984

(We only use 1945-1995)
Archives of hundreds of newspapers, with tons of newspaper issues, or tens of millions of articles, between 1618 and 1995.

(We only use 1945-1995)
Transcriptions of all
meetings of the
European Parliament
between 1999 and
2014.
Linked Open Data
Links in PoliMedia
is about
• 3 Million links
Discovering links between politics and news
Step 2: generate links
[Pipeline diagram]
Debates -> Detect topics in speeches -> Topics
Debates -> Detect Named Entities in speeches -> Named Entities
Topics, Named Entities, Name of speaker, Date of debate -> Create queries -> Queries
Queries -> Search newspaper archive -> Candidate articles
Candidate articles -> Rank candidate articles -> Links between speeches and articles
Intuition 1: The name of the speaker should appear in the article, and the article should be published within a week of the debate.
Intuition 2: The more the article and the speech overlap in terms of topics and named entities, the more they are related.
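The two intuitions can be sketched as a filter followed by a ranking. The Python code below is an illustration only, not the PoliMedia linking engine: the Jaccard overlap, the reading of "within a week" as zero to seven days after the debate, the field names, and the example speech and articles are all assumptions made for this sketch.

```python
from datetime import date, timedelta

def is_candidate(speech, article, window_days=7):
    """Intuition 1: speaker named in the article, published within a week of the debate."""
    delay = article["published"] - speech["date"]
    return (speech["speaker"].lower() in article["text"].lower()
            and timedelta(0) <= delay <= timedelta(days=window_days))

def overlap(speech, article):
    """Intuition 2: Jaccard overlap of topics and named entities (an assumed measure)."""
    a, b = set(speech["terms"]), set(article["terms"])
    return len(a & b) / len(a | b) if a | b else 0.0

def rank_articles(speech, articles):
    """Keep the candidate articles and sort them by decreasing overlap."""
    candidates = [art for art in articles if is_candidate(speech, art)]
    return sorted(candidates, key=lambda art: overlap(speech, art), reverse=True)

# Hypothetical speech and articles, for illustration only.
speech = {"speaker": "Jansen", "date": date(1950, 3, 1),
          "terms": {"housing", "budget", "Rotterdam"}}
articles = [
    {"id": "a1", "text": "Minister Jansen defended the budget.",
     "published": date(1950, 3, 2), "terms": {"budget"}},
    {"id": "a2", "text": "Jansen on housing and the budget in Rotterdam.",
     "published": date(1950, 3, 3), "terms": {"housing", "budget", "Rotterdam"}},
    {"id": "a3", "text": "Jansen visits a school.",
     "published": date(1950, 4, 1), "terms": {"school"}},
]
ranked = [art["id"] for art in rank_articles(speech, articles)]
print(ranked)  # prints ['a2', 'a1']
```

Article `a3` mentions the speaker but is published too late, so it never reaches the ranking step.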
Representation of links
[Diagram: :speech123 and :newsArticle456 connected directly by :isAbout]
[Diagram: the link as a node of its own, :link001, connecting :speech123 and :newsArticle456, with :creationDate 01-02-2013, :madeBy :PoliMedia_Linking_Engine, the link type :quotes, and :concept1 and :concept2]
• Note: this is another example of the “design pattern” referred to as n-ary relations or relations as classes!
• It allows us to save provenance information about the statements we create.
Evaluation of links
1. Manually rating (a sample of) links
• relatively cheap and easy to interpret
• only precision, no recall
2. Comparison to a reference linkset
• precision and recall
• used in OAEI on the SEALS platform
• more expensive if a reference alignment has to be created (but: crowdsourcing!)
3. End-to-end evaluation (a.k.a. evaluating an application that uses the mappings)
• arguably the best method!
• need to have access to an application + users
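For the second method, precision and recall follow directly from set operations on the generated links and the reference linkset. A minimal Python sketch, assuming links are (speech, article) pairs; the identifiers below are made up for illustration.

```python
def precision_recall(found, reference):
    """Compare a set of generated links with a reference linkset."""
    found, reference = set(found), set(reference)
    correct = found & reference
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall

# Hypothetical links, for illustration only.
found = {("speech1", "article1"), ("speech1", "article2"),
         ("speech2", "article3"), ("speech3", "article9")}
reference = {("speech1", "article1"), ("speech2", "article3"),
             ("speech2", "article4")}

precision, recall = precision_recall(found, reference)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

Manual rating of a sample only supports the first of these two numbers, because it never looks at links that were not generated.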
Evaluation of links: beyond precision / recall
[Diagram: classes A, B and C at the ontology / vocabulary / schema level; resource r at the data level]
Generalized precision and Generalized recall
• Instead of a binary classification into correct/incorrect mappings, take into account how wrong a link is:
[Formula on slide]
• where r(a) is the semantic distance between correspondence a and correspondence a’ in the reference alignment, A is the number of correspondences.
Laura Hollink, Mark van Assem, Shenghui Wang, Antoine Isaac, Guus Schreiber. Two Variations on Ontology Alignment Evaluation: Methodological Issues. ESWC 2008.
Evaluation of links in PoliMedia
How good are the links?
• We ask 2 raters to manually score pairs of newspaper articles and speeches.
• a pilot study showed that we needed more than a 2-point scale.
• inter-rater agreement: 0.5 -> acceptable, but not high.
• Precision: 80%
How many links did we miss?
• We ask the raters to manually search the KB archives for related articles.
• Recall: 62%
Setting 1   Setting 2   Setting 3
0.48        0.62        0.8
DEMO - PoliMedia search application
Linked Open Data
Online database:
“SPARQL endpoint”
• A service to query a knowledge
base using the SPARQL query
language.

“All speeches with more
than 60 associated news
items.”
Access to Linked Open Data: how to serve and
how to consume Linked Open Data
Access to LOD: 1. download a data dump
From server logs we know the query
-some context of the requested URIs
-variable names (?)
Access to LOD 2: follow-your-nose
[Diagram, written as triples; each step follows a URI found in the previous step]
lp:eu/plenary/2013-11-20/AgendaItem_6
    dc:title "Award of the Sakharov Prize (formal sitting)."@en ;
    dc:hasPart lp:eu/plenary/2013-11-20/Speech_103 , lp:eu/plenary/2013-11-20/Speech_104 .
lp:eu/plenary/2013-11-20/Speech_103
    lpv:spokenText "...the fittest need to struggle for the survival of the weak.[...]"@en ;
    lpv:speaker lp:Speaker_Malala_Yousafzai ;
    lpv:hasSubsequent lp:eu/plenary/2013-11-20/Speech_104 .
lp:eu/plenary/2013-11-20/Speech_104
    lpv:spokenText "...Ich glaube, das war ein außergewöhnlicher Moment für uns alle hier in diesem Parlament[...]"@en ;
    lpv:speaker lp:Martin_Schulz .
lp:Martin_Schulz
    owl:sameAs <http://dbpedia.org/resource/Martin_Schulz> .
Access to LOD 2: follow-your-nose
lp:eu/plenary/
2013-11-20/AgendaItem_6
lp:eu/plenary/
2013-11-20/Speech_103
"Award of the Sakharov Prize (formal sitting)."@en
dc:title
dc:hasPart
lp:eu/plenary/
2013-11-20/AgendaItem_6
lp:eu/plenary/2013-11-20/
Speech_104
dc:hasPart
lp:eu/plenary/
2013-11-20/Speech_103
...the fittest need to
struggle for the
survival of the
weak.[...]"@en
lpv:spokenText
lpv:speaker
lp:Speaker_Malala_Yousafzai
"Award of the Sakharov Prize (formal sitting)."@en
dc:title
dc:hasPart
lp:eu/plenary/
2013-11-20/AgendaItem_6
lp:eu/plenary/2013-11-20/
Speech_104
lpv:has
Subsequent
dc:hasPart
lp:eu/plenary/
2013-11-20/Speech_103
...the fittest need to
struggle for the
survival of the
weak.[...]"@en
lpv:spokenText
lpv:speaker
lp:Speaker_Malala_Yousafzai
"Award of the Sakharov Prize (formal sitting)."@en
dc:title
dc:hasPart
lp:eu/plenary/
2013-11-20/AgendaItem_6
lp:eu/plenary/2013-11-20/
Speech_104
lpv:has
Subsequent
...Ich glaube, das war ein
außergewöhnlicher Moment
für uns alle hier in diesem
Parlament[...]"@en
lpv:spokenText
lpv:speaker
dc:hasPart
lp:Martin_Schulz
lp:eu/plenary/
2013-11-20/Speech_103
...the fittest need to
struggle for the
survival of the
weak.[...]"@en
lpv:spokenText
lpv:speaker
lp:Speaker_Malala_Yousafzai
"Award of the Sakharov Prize (formal sitting)."@en
dc:title
dc:hasPart
lp:eu/plenary/
2013-11-20/AgendaItem_6
lp:eu/plenary/2013-11-20/
Speech_104
lpv:has
Subsequent
...Ich glaube, das war ein
außergewöhnlicher Moment
für uns alle hier in diesem
Parlament[...]"@en
lpv:spokenText
lpv:speaker
owl:sameAs
http:://dbpedia.org/
resource/Martin_Schulz
dc:hasPart
lp:Martin_Schulz
lp:eu/plenary/
2013-11-20/Speech_103
...the fittest need to
struggle for the
survival of the
weak.[...]"@en
lpv:spokenText
lpv:speaker
lp:Speaker_Malala_Yousafzai
"Award of the Sakharov Prize (formal sitting)."@en
dc:title
dc:hasPart
lp:eu/plenary/
2013-11-20/AgendaItem_6
lp:eu/plenary/2013-11-20/
Speech_104
lpv:has
Subsequent
...Ich glaube, das war ein
außergewöhnlicher Moment
für uns alle hier in diesem
Parlament[...]"@en
lpv:spokenText
lpv:speaker
owl:sameAs
http:://dbpedia.org/
resource/Martin_Schulz
dc:hasPart
lp:Martin_Schulz
dbp:children
"2"
lpv:speaker
dbc:Officiers_of_the_Légion_d'honneur
Access to LOD 2: follow-your-nose
lp:eu/plenary/
2013-11-20/AgendaItem_6
lp:eu/plenary/
2013-11-20/Speech_103
"Award of the Sakharov Prize (formal sitting)."@en
dc:title
dc:hasPart
lp:eu/plenary/
2013-11-20/AgendaItem_6
lp:eu/plenary/2013-11-20/
Speech_104
dc:hasPart
lp:eu/plenary/
2013-11-20/Speech_103
...the fittest need to
struggle for the
survival of the
weak.[...]"@en
lpv:spokenText
lpv:speaker
lp:Speaker_Malala_Yousafzai
"Award of the Sakharov Prize (formal sitting)."@en
dc:title
dc:hasPart
lp:eu/plenary/
2013-11-20/AgendaItem_6
lp:eu/plenary/2013-11-20/
Speech_104
lpv:has
Subsequent
dc:hasPart
lp:eu/plenary/
2013-11-20/Speech_103
...the fittest need to
struggle for the
survival of the
weak.[...]"@en
lpv:spokenText
lpv:speaker
lp:Speaker_Malala_Yousafzai
"Award of the Sakharov Prize (formal sitting)."@en
dc:title
dc:hasPart
lp:eu/plenary/
2013-11-20/AgendaItem_6
lp:eu/plenary/2013-11-20/
Speech_104
lpv:has
Subsequent
...Ich glaube, das war ein
außergewöhnlicher Moment
für uns alle hier in diesem
Parlament[...]"@en
lpv:spokenText
lpv:speaker
dc:hasPart
lp:Martin_Schulz
lp:eu/plenary/
2013-11-20/Speech_103
...the fittest need to
struggle for the
survival of the
weak.[...]"@en
lpv:spokenText
lpv:speaker
lp:Speaker_Malala_Yousafzai
"Award of the Sakharov Prize (formal sitting)."@en
dc:title
dc:hasPart
lp:eu/plenary/
2013-11-20/AgendaItem_6
lp:eu/plenary/2013-11-20/
Speech_104
lpv:has
Subsequent
...Ich glaube, das war ein
außergewöhnlicher Moment
für uns alle hier in diesem
Parlament[...]"@en
lpv:spokenText
lpv:speaker
owl:sameAs
http:://dbpedia.org/
resource/Martin_Schulz
dc:hasPart
lp:Martin_Schulz
lp:eu/plenary/
2013-11-20/Speech_103
...the fittest need to
struggle for the
survival of the
weak.[...]"@en
lpv:spokenText
lpv:speaker
lp:Speaker_Malala_Yousafzai
"Award of the Sakharov Prize (formal sitting)."@en
dc:title
dc:hasPart
lp:eu/plenary/
2013-11-20/AgendaItem_6
lp:eu/plenary/2013-11-20/
Speech_104
lpv:has
Subsequent
...Ich glaube, das war ein
außergewöhnlicher Moment
für uns alle hier in diesem
Parlament[...]"@en
lpv:spokenText
lpv:speaker
owl:sameAs
http:://dbpedia.org/
resource/Martin_Schulz
dc:hasPart
lp:Martin_Schulz
dbp:children
"2"
lpv:speaker
dbc:Officiers_of_the_Légion_d'honneur
From server logs we know the requested URI:

GET /Martin_Schulz HTTP/1.0
Accept: application/rdf+xml
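
A follow-your-nose client produces this kind of log entry: it requests a URI and asks for RDF through the Accept header. The following is a minimal sketch using only Python's standard library. The function names are illustrative assumptions, and the DBpedia URI is the one linked from lp:Martin_Schulz; the sketch has not been run against the live server.

```python
import urllib.request


def build_lookup_request(uri: str,
                         accept: str = "application/rdf+xml") -> urllib.request.Request:
    """Prepare an HTTP GET that asks the server for RDF rather than HTML."""
    return urllib.request.Request(uri, headers={"Accept": accept}, method="GET")


def dereference(uri: str) -> bytes:
    """Fetch the RDF description of a resource (performs a network call)."""
    with urllib.request.urlopen(build_lookup_request(uri), timeout=10) as response:
        return response.read()


# Building the request does not touch the network; only dereference() does.
request = build_lookup_request("http://dbpedia.org/resource/Martin_Schulz")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Each response may contain new URIs (for example the target of an owl:sameAs link), which the client can dereference in turn.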
Access to LOD: 3. SPARQL
Count the agenda items in which at least one MEP from France spoke.

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX lpv: <http://purl.org/linkedpolitics/vocabulary/>

SELECT (COUNT(DISTINCT ?ai) AS ?count)
WHERE {
  ?ai rdf:type <http://purl.org/linkedpolitics/vocabulary/eu/plenary/AgendaItem> .
  ?ai dcterms:hasPart ?speech .
  ?speech lpv:speaker ?speaker .
  ?speaker lpv:countryOfRepresentation ?country .
  ?country rdfs:label ?label .
  FILTER(?label = "France"@en)
}
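
To make the join structure of this query concrete, the sketch below evaluates the same chain of triple patterns over a handful of invented triples in plain Python. All identifiers in the toy data (ai1, speech1, mepA, countryFR and so on) are made up for illustration and do not come from the Talk of Europe dataset; a real client would send the query to a SPARQL endpoint instead.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Invented toy data, shaped like the graph pattern in the query.
TRIPLES: Set[Triple] = {
    ("ai1", "rdf:type", "AgendaItem"),
    ("ai2", "rdf:type", "AgendaItem"),
    ("ai3", "rdf:type", "AgendaItem"),
    ("ai1", "dcterms:hasPart", "speech1"),
    ("ai1", "dcterms:hasPart", "speech2"),
    ("ai2", "dcterms:hasPart", "speech3"),
    ("ai3", "dcterms:hasPart", "speech4"),
    ("speech1", "lpv:speaker", "mepA"),
    ("speech2", "lpv:speaker", "mepA"),
    ("speech3", "lpv:speaker", "mepB"),
    ("speech4", "lpv:speaker", "mepC"),
    ("mepA", "lpv:countryOfRepresentation", "countryFR"),
    ("mepB", "lpv:countryOfRepresentation", "countryDE"),
    ("mepC", "lpv:countryOfRepresentation", "countryFR"),
    ("countryFR", "rdfs:label", '"France"@en'),
    ("countryDE", "rdfs:label", '"Germany"@en'),
}


def objects(subject: str, predicate: str) -> Set[str]:
    """All objects o such that (subject, predicate, o) is in the data."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def count_agenda_items(label: str) -> int:
    """Count distinct agenda items with a speech by a speaker from `label`."""
    matching = set()
    for ai, predicate, obj in TRIPLES:
        if predicate != "rdf:type" or obj != "AgendaItem":
            continue
        for speech in objects(ai, "dcterms:hasPart"):
            for speaker in objects(speech, "lpv:speaker"):
                for country in objects(speaker, "lpv:countryOfRepresentation"):
                    if label in objects(country, "rdfs:label"):
                        matching.add(ai)  # a set gives COUNT(DISTINCT ...)
    return len(matching)


print(count_agenda_items('"France"@en'))
```

Collecting the agenda items in a set mirrors DISTINCT: ai1 has two speeches by a French MEP in the toy data but is counted once.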
From server logs we know the query:
- some context of the requested URIs
- variable names (?)
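
As a rough illustration of what can be read from such a log, the sketch below recovers the SPARQL text from a logged request target and lists the variable names and IRIs it mentions. The `/sparql?query=...` request shape and the sample query are assumptions for the example, and the regular expressions are a heuristic, not a SPARQL parser.

```python
import re
from typing import List, Optional
from urllib.parse import parse_qs, urlencode, urlsplit

IRI = re.compile(r"<([^<>\s]+)>")
VARIABLE = re.compile(r"[?$]([A-Za-z_][A-Za-z0-9_]*)")


def query_from_request(target: str) -> Optional[str]:
    """Return the decoded value of the 'query' parameter, if present."""
    values = parse_qs(urlsplit(target).query).get("query")
    return values[0] if values else None


def iris(query: str) -> List[str]:
    """IRIs written in angle brackets (prefixed names are not expanded)."""
    return sorted(set(IRI.findall(query)))


def variables(query: str) -> List[str]:
    """Variable names; IRIs are blanked first so '?' inside them is ignored."""
    return sorted(set(VARIABLE.findall(IRI.sub(" ", query))))


# Hypothetical logged request, built here from a sample query.
sparql = ("SELECT ?speech WHERE { ?speech "
          "<http://purl.org/linkedpolitics/vocabulary/speaker> ?speaker . }")
logged_target = "/sparql?" + urlencode({"query": sparql})

recovered = query_from_request(logged_target)
print(variables(recovered))
print(iris(recovered))
```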
Access to LOD: 4. Linked Data Fragments
xxx.xxx.xxx.xxx - - [17/Oct/2014:07:43:02 +0000] "GET /2014/en?subject=&predicate=&object=dbpedia%3AAustin HTTP/1.1" 200 1309 "http://fragments.dbpedia.org/2014/en"
…
From server logs we know the triple patterns that were requested:
- some context of the requested URIs
- variable names (?)
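
The triple pattern can be read directly from the request line of such a log entry. The sketch below parses the example entry shown earlier; the function name and the convention of printing empty parameters as `?subject`, `?predicate`, `?object` are assumptions made for readability.

```python
import re
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlsplit

# The quoted request part of a combined-format log line: "METHOD target PROTOCOL"
REQUEST = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+) (?P<protocol>[^"]+)"')


def triple_pattern(log_line: str) -> Optional[Tuple[str, str, str]]:
    """Extract (subject, predicate, object) from a fragment request."""
    match = REQUEST.search(log_line)
    if match is None:
        return None
    params = parse_qs(urlsplit(match.group("target")).query,
                      keep_blank_values=True)

    def term(name: str) -> str:
        value = params.get(name, [""])[0]
        return value if value else "?" + name  # empty parameter = unbound

    return term("subject"), term("predicate"), term("object")


line = ('xxx.xxx.xxx.xxx - - [17/Oct/2014:07:43:02 +0000] '
        '"GET /2014/en?subject=&predicate=&object=dbpedia%3AAustin HTTP/1.1" '
        '200 1309 "http://fragments.dbpedia.org/2014/en"')
print(triple_pattern(line))
```

For the example entry, only the object position is bound (to dbpedia:Austin); subject and predicate are left open.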
What do we know about usage of Linked Open Data?
1. Yearly datasets of server logs released for research purposes, 2011-2016
Luczak-Roesch, Markus, Aljaloud, Saud, Berendt, Bettina and Hollink, Laura (2016) USEWOD 2016 Research Dataset. doi:10.5258/SOTON/385344

2. Yearly workshops for researchers on Usage Data and the Web of Data, 2011-2016
Laura Hollink, Markus Luczak-Roesch, Bettina Berendt, et al.
http://usewod.org/

[Timeline graphic: USEWOD2011 to 2016]
Linked Open Data query log analysis?
Licensing + Anonymization: replace all IPs with a country code and an identifier
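
One way such an anonymization step could work is sketched below: each distinct IP address is mapped to a stable pseudonym made of a country code and a per-country counter. The `NL-1` output format, the class name, and the toy country lookup are assumptions for illustration; the slides do not specify how the USEWOD datasets implement this.

```python
from typing import Callable, Dict


class IpAnonymizer:
    """Replace IP addresses by '<country code>-<identifier>' pseudonyms."""

    def __init__(self, country_of: Callable[[str], str]) -> None:
        self._country_of = country_of
        self._pseudonyms: Dict[str, str] = {}
        self._counters: Dict[str, int] = {}

    def pseudonym(self, ip: str) -> str:
        # The same IP always maps to the same pseudonym, so sessions stay linkable.
        if ip not in self._pseudonyms:
            country = self._country_of(ip)
            number = self._counters.get(country, 0) + 1
            self._counters[country] = number
            self._pseudonyms[ip] = f"{country}-{number}"
        return self._pseudonyms[ip]


def anonymize_line(line: str, anonymizer: IpAnonymizer) -> str:
    """Rewrite a log line whose first space-separated field is the client IP."""
    ip, _, rest = line.partition(" ")
    return f"{anonymizer.pseudonym(ip)} {rest}"


# Hypothetical lookup table over documentation-only IP ranges.
TOY_COUNTRIES = {"192.0.2.7": "NL", "192.0.2.8": "NL", "198.51.100.3": "DE"}
anonymizer = IpAnonymizer(lambda ip: TOY_COUNTRIES.get(ip, "ZZ"))

print(anonymize_line('192.0.2.7 - - [17/Oct/2014:07:43:02 +0000] "GET /sparql HTTP/1.1" 200 1309',
                     anonymizer))
```

Keeping the mapping stable within a dataset preserves the ability to study repeated requests from one client without exposing its address.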
What has been found so far?
• Efficient index generation [1]
• Caching [2]
• Auto-completion [3]
• Hardware scaling at peak times [4]
• Modularisation of data [4]
[1] Arias, M., Fernández, J. D., Martínez-Prieto, M. A., & de la Fuente, P. (2011). An empirical study of real-world SPARQL queries. USEWOD2011
[2] Lorey, J., & Naumann, F. Caching and prefetching strategies for SPARQL queries. USEWOD2013
[3] Kramer, K., Dividino, R. Q., & Gröner, G. SPACE: SPARQL Index for Efficient Autocompletion. ISWC2013 (Posters & Demos)
[4] Luczak-Rösch, M., & Bischoff, M. (2011). Statistical analysis of web of data usage. EvoDyn2011
[5] Rietveld, L., & Hoekstra, R. Man vs. Machine: Differences in SPARQL Queries. USEWOD2014
[6] Huelss, J., & Paulheim, H. What SPARQL Query Logs Tell and do not Tell about Semantic Relatedness in LOD. NoISE @ ESWC 2015
Issues:
• What is the difference between queries by machines and humans? [5]
• What is the meaning of repeated queries by tools? Bots?
• A lot of the usage is invisible due to data dump download
[6]
Reflection: to what extent can we now answer these questions?
How did the debate about the financial crisis in Greece develop?
Which political event has attracted most media attention?
What are the differences between different media?
Has the coverage changed over time?
Yes, but:
• What is the influence of the selection of newspapers available at the National Library?
• What was the quality of the digitisation process (OCR)?
• How good is our linking approach (based on automatically detected entities and topics)?

➡ How to handle these uncertainties is one of our research questions! We call this Tool Criticism.
Resources:
PoliMedia demo: http://polimedia.nl/
PoliMedia project video: https://youtu.be/u24oRCj7xrQ
Talk of Europe project: http://talkofeurope.eu/
Talk of Europe data: purl.org/linkedpolitics
Talk of Europe project video: https://youtu.be/GxA53gkCe0o
USEWOD workshop: http://usewod.org/
My website: http://homepages.cwi.nl/~hollink/
I’d be happy to answer your questions!

Linked Open Data

  • 1. Linked Open Data SIKS course on Data Science May 20, 2016 Vught. Laura Hollink
  • 2. Why do we create and use Linked Open Data? Example questions from the humanities and social sciences How did the debate about the financial crisis in Greece develop?
  • 3. Searching the proceedings of the EU Parliament "Greece" in the plenary meetings of the European Parliament Year Nr.ofmentions 050100150200 1999 2000 2001 2001 2002 2003 2004 2005 2006 2006 2007 2008 2009 2010 2010 2011 2012 2013
  • 4. Searching through newspaper archives Mentions of “Griekenland” in the Dutch newspaper the Telegraaf.
  • 5. Search volumes on a search engine Query = “Greece” http://www.google.com/trends
  • 6. Search volumes on a search engine Query = “Greece” http://www.google.com/trends We need access to data. Analysing them gives us some useful insight. But to answer the question properly we would need to combine sources and do more complex queries.
  • 7. Why do we create and use Linked Open Data? Example question 2 Which political debate in the post-war period has attracted most media attention?
  • 9. “De Indonesische Quaestie" To answer this question we need to go through all newspaper articles about all political debates. -> we need access to combined data sources, we need structured queries.
  • 10. Why do we create and use Linked Open Data?
  • 11. Why do we create and use Linked Open Data? Example question 3 What are the differences between different media? Example question 4 Has the coverage changed over time?
  • 12. Research goals and research questions Our goal is to build an infrastructure to answer these kinds of questions. 1. How do we automatically link heterogeneous datasets? 2. How do we interpret links between datasets of different quality and certainty? 3. What can we conclude from usage statistics on these datasets? 4. Can we design interfaces that allow scholars to study the datasets • including the links between them? • while assessing the reliability of the findings?
  • 13. Research goals and research questions Our goal is to build an infrastructure to answer these kinds of questions. 1. How do we automatically link heterogeneous datasets? 2. How do we interpret links between datasets of different quality and certainty? 3. What can we conclude from usage statistics on these datasets? 4. Can we design interfaces that allow scholars to study the datasets • including the links between them? • while assessing the reliability of the findings? Data Science - Big Data - Linked Open Data
  • 14. Table of Contents 1. What is Linked Open Data (LOD) 2. Creating LOD 1. How to discover links 2. How to represent links on the Web 3. How to evaluate links 3. Access to LOD (from both the server and the client perspective)
  • 15. What is Linked Open Data?
  • 16. What is Linked Open Data?
  • 17. What is Linked Open Data? A method of publishing structured data on the Web in such a way that it can be linked and queried by computers as well as humans.
  • 18. The Web of Documents
  • 19. The Web of Documents • Documents  identified  by  URIs  (html,  pdf,  images,  movies,  etc.)   • with  structured  information  for  humans  (tables,  headers)  and   • with  hyperlinks  between  them   • The  data  is  not  machine  readable,  meant  for  humans   • structure  is  implicit  (what  do  the  columns  of  a  table  mean?)   • links  are  not  typed  (what  is  the  relation  between  two  documents?)  
  • 20. The Web of Data
  • 21. The Web of Data • Everything  identified  by  URIs  (not  just  documents,  but  also  classes,   instances,  relations/links)   • The  data  is  machine  readable:     • in  formal  languages  (RDF,  RDFS,  OWL,  SKOS)     • which  enable  machines  to  do  reasoning,  i.e.  infer  new  statements   from  inserted  statements.  
  • 22. Compared to a database table… Amsterdam has population “1364422” City Schiphol is a has airport
  • 23. Thing Type Population Airport Amsterdam City 1364422 Schiphol …. … …. … Compared to a database table… Amsterdam has population “1364422” City Schiphol is a has airport
  • 24. Differences: • Statements can be distributed over the web • Non-unique naming assumption • Open World assumption • Everyone can say anything about anything Thing Type Population Airport Amsterdam City 1364422 Schiphol …. … …. … Compared to a database table… Amsterdam has population “1364422” City Schiphol is a has airport
  • 25. Examples of URIs on the Web of Data • documents: • http://vu.nl/index.html • http://example.org/cities#Leuven • real world objects (a book in the library, a person) • isbn://5031-4444-333 • http://eyaloren.org/foaf.rdf#me • concepts: • http://cyc.org/concept/Mammal • http://cyc.org/concept/Dog • www.w3.org/2006/03/wn/wn20/instances/synset-anniversary-noun-1 • relations: • http://purl.org/linkedpolitics/vocabulary/speaker
  • 26. RDF (the basics) • A W3C recommendation to describe resources on the Web of Data called “Resource description Framework” • See https://www.w3.org/RDF/ • RDF data model: triples!
  • 27. RDF (the basics) • A W3C recommendation to describe resources on the Web of Data called “Resource description Framework” • See https://www.w3.org/RDF/ • RDF data model: triples!
  • 28. RDF (the basics) • A W3C recommendation to describe resources on the Web of Data called “Resource description Framework” • See https://www.w3.org/RDF/ • RDF data model: triples! RDF example in Turtle syntax: <bob#me> a foaf:Person ; foaf:knows <alice#me> ; schema:birthDate "1990-07-04"^^xsd:date ; foaf:topic_interest wd:Q12418 .
  • 29. Vocabulary definition and reasoning with RDFS B C r A data level ontology / vocabulary / schema level
  • 30. Vocabulary definition and reasoning with RDFS A B C IF B rdfs:subClassOf A C rdfs:subClassOf B THEN C rdfs:subClassOf A B C r A data level ontology / vocabulary / schema level
  • 31. Vocabulary definition and reasoning with RDFS A B A B C IF B rdfs:subClassOf A C rdfs:subClassOf B THEN C rdfs:subClassOf A IF C rdfs:subClassOf B r rdf:type C THEN r rdf:type B B C r A data level ontology / vocabulary / schema level
  • 32. Vocabulary definition and reasoning with RDFS A B A B C IF B rdfs:subClassOf A C rdfs:subClassOf B THEN C rdfs:subClassOf A IF B rdfs:subClassOf A r rdf:type B THEN r rdf:type A <bob#me> rdf:type foaf:Person . foaf:Person rdfs:subClassOf foaf:Agent .
  • 33. Vocabulary definition and reasoning with RDFS A B A B C IF B rdfs:subClassOf A C rdfs:subClassOf B THEN C rdfs:subClassOf A IF B rdfs:subClassOf A r rdf:type B THEN r rdf:type A <bob#me> rdf:type foaf:Person . foaf:Person rdfs:subClassOf foaf:Agent . <bob#me> a foaf:Agent .
  • 34. Vocabulary definition and reasoning with RDFS A B A B C IF B rdfs:subClassOf A C rdfs:subClassOf B THEN C rdfs:subClassOf A IF B rdfs:subClassOf A r rdf:type B THEN r rdf:type A <bob#me> rdf:type foaf:Person . foaf:Person rdfs:subClassOf foaf:Agent . <bob#me> a foaf:Agent . Standard meaning
  • 35. Vocabulary definition and reasoning with RDFS IF p rdfs:range R A p B THEN B rdf:type R <bob#me> foaf:knows <alice#me> . foaf:knows rdfs:range foaf:Person .
  • 36. Vocabulary definition and reasoning with RDFS IF p rdfs:range R A p B THEN B rdf:type R <bob#me> foaf:knows <alice#me> . foaf:knows rdfs:range foaf:Person . <alice#me> rdf:type foaf:Person .
  • 37. Vocabulary definition and reasoning with RDFS IF p rdfs:range R A p B THEN B rdf:type R <bob#me> foaf:knows <alice#me> . foaf:knows rdfs:range foaf:Person . <alice#me> rdf:type foaf:Person . Standard meaning
  • 38. SPARQL (the basics) • A W3C recommendation for querying RDF graphs called “SPARQL Protocol And RDF Query Language” • See http://www.w3.org/TR/rdf-sparql-query/ or http://www.w3.org/TR/ sparql11-query/
  • 39. SPARQL (the basics) • A W3C recommendation for querying RDF graphs called “SPARQL Protocol And RDF Query Language” • See http://www.w3.org/TR/rdf-sparql-query/ or http://www.w3.org/TR/ sparql11-query/ Data: :JamesDean :playedIn :Giant .
  • 40. SPARQL (the basics) • A W3C recommendation for querying RDF graphs called “SPARQL Protocol And RDF Query Language” • See http://www.w3.org/TR/rdf-sparql-query/ or http://www.w3.org/TR/ sparql11-query/ Data: :JamesDean :playedIn :Giant .
  • 41. SPARQL (the basics) • A W3C recommendation for querying RDF graphs called “SPARQL Protocol And RDF Query Language” • See http://www.w3.org/TR/rdf-sparql-query/ or http://www.w3.org/TR/ sparql11-query/ Data: :JamesDean :playedIn :Giant . Query: :JamesDean :playedIn ?what .
  • 42. SPARQL (the basics) • A W3C recommendation for querying RDF graphs called “SPARQL Protocol And RDF Query Language” • See http://www.w3.org/TR/rdf-sparql-query/ or http://www.w3.org/TR/ sparql11-query/ Data: :JamesDean :playedIn :Giant . Query: :JamesDean :playedIn ?what . Answer: :Giant
  • 43. SPARQL (the basics) • A W3C recommendation for querying RDF graphs called “SPARQL Protocol And RDF Query Language” • See http://www.w3.org/TR/rdf-sparql-query/ or http://www.w3.org/TR/ sparql11-query/ Data: :JamesDean :playedIn :Giant . Query: :JamesDean :playedIn ?what . Answer: :Giant
  • 44. SPARQL (the basics) • A W3C recommendation for querying RDF graphs called “SPARQL Protocol And RDF Query Language” • See http://www.w3.org/TR/rdf-sparql-query/ or http://www.w3.org/TR/ sparql11-query/ Data: :JamesDean :playedIn :Giant . Query: :JamesDean :playedIn ?what . Answer: :Giant Query: ?who :playedIn :Giant.
  • 45. SPARQL (the basics) • A W3C recommendation for querying RDF graphs called “SPARQL Protocol And RDF Query Language” • See http://www.w3.org/TR/rdf-sparql-query/ or http://www.w3.org/TR/ sparql11-query/ Data: :JamesDean :playedIn :Giant . Query: :JamesDean :playedIn ?what . Answer: :Giant Query: ?who :playedIn :Giant. Answer: :JamesDean
  • 46. SPARQL (the basics) • A W3C recommendation for querying RDF graphs called “SPARQL Protocol And RDF Query Language” • See http://www.w3.org/TR/rdf-sparql-query/ or http://www.w3.org/TR/ sparql11-query/ Data: :JamesDean :playedIn :Giant . Query: :JamesDean :playedIn ?what . Answer: :Giant Query: ?who :playedIn :Giant. Answer: :JamesDean
  • 47. SPARQL (the basics) • A W3C recommendation for querying RDF graphs called “SPARQL Protocol And RDF Query Language” • See http://www.w3.org/TR/rdf-sparql-query/ or http://www.w3.org/TR/ sparql11-query/ Data: :JamesDean :playedIn :Giant . Query: :JamesDean :playedIn ?what . Answer: :Giant Query: ?who :playedIn :Giant. Answer: :JamesDean Query: :JamesDean ?what :Giant.
  • 48. SPARQL (the basics) • A W3C recommendation for querying RDF graphs called “SPARQL Protocol And RDF Query Language” • See http://www.w3.org/TR/rdf-sparql-query/ or http://www.w3.org/TR/ sparql11-query/ Data: :JamesDean :playedIn :Giant . Query: :JamesDean :playedIn ?what . Answer: :Giant Query: ?who :playedIn :Giant. Answer: :JamesDean Query: :JamesDean ?what :Giant. Answer: :playedIn
  • 49. Linked Open Data • A method of publishing on the Web of Data: openly available, in RDF, with links to other datasets. • Figure: Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
  • 51. Creating Linked Open Data in the Talk of Europe project: Discovering links, knowledge representation
  • 53. The European Parliament as Linked Open Data Laura Hollink Centrum Wiskunde & Informatica, Amsterdam Astrid van Aggelen VU University Amsterdam Martijn Kleppe Erasmus University Rotterdam Henri Beunders Erasmus University Rotterdam Jill Briggeman Erasmus University Rotterdam Max Kemman University of Luxembourg
  • 54. Talk of Europe goals • To publish the entire plenary debates of the European Parliament as Linked Open Data • To improve access to the data • To enable large-scale analysis across time spans ‣ For residents of the European Union, access to the proceedings of the European Parliament is a formal right. A. van Aggelen, L. Hollink, M. Kemman, M. Kleppe & H. Beunders. The debates of the European Parliament as Linked Open Data. Semantic Web Journal. In press, 2016.
  • 57. 1. Data in RDF 14M RDF statements about the 30K speeches in 23 languages by 3K speakers in 1K session days that were held in the EU parliament between 1999 and 2014
  • 58. 2. Links to external datasets •
  • 61. Example 1: speeches that contain a certain keyword Query: all speeches that contain the phrase “open data” …. So let us go for open data, let us go for utilisation of all the instruments available to that end! ….. …. but there too governments are encouraging the use of open data to increase transparency, accountability and citizen participation …. …. We already have many open data projects in the Member States and local authorities…..
  • 62. Example 2: speeches that contain a certain keyword by date [Chart: "Slovenia" in the plenary meetings of the European Parliament; x-axis: year (1999 to 2013), y-axis: number of mentions (0 to 100)]
  • 64. Example 2: speeches that contain a certain keyword by date [Chart: mentions of 'human rights'; x-axis: dates (1999 to 2013), y-axis: frequency (0 to 800)]
  • 65. Example 3: speeches that contain a certain keyword by country [Chart: mentions of 'human rights' by country; one bar per country: AT BE BG CY CZ DE DK EE ES FI FR GB GR HR HU IE IT LT LU LV MT NL PL PT RO SE SI SK; scale 0 to 7000]
  • 66. Example 4: the number of speeches per EU country
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      SELECT ?c (COUNT(?c) AS ?count)
      WHERE {
        ?x rdf:type <http://purl.org/linkedpolitics/vocabulary/eu/plenary/Speech> .
        ?x <http://purl.org/linkedpolitics/vocabulary#speaker> ?p .
        ?p <http://purl.org/linkedpolitics/vocabulary#countryOfRepresentation> ?c
      }
      GROUP BY ?c
      LIMIT 50
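To make the GROUP BY / COUNT step of this query concrete, here is a rough Python sketch of the same join and aggregation over a handful of in-memory triples. The data and the shortened property names are hypothetical and only mirror the shape of the query; the real dataset uses full IRIs and is queried through a SPARQL endpoint.

```python
from collections import Counter

# Hypothetical toy data; identifiers are illustrative, not taken from the dataset.
TRIPLES = [
    ("speech1", "rdf:type", "Speech"),
    ("speech2", "rdf:type", "Speech"),
    ("speech3", "rdf:type", "Speech"),
    ("speech1", "speaker", "mep1"),
    ("speech2", "speaker", "mep1"),
    ("speech3", "speaker", "mep2"),
    ("mep1", "countryOfRepresentation", "NL"),
    ("mep2", "countryOfRepresentation", "SI"),
]


def speeches_per_country(triples):
    """Join speech -> speaker -> country and count speeches per country."""
    speeches = {s for s, p, o in triples if p == "rdf:type" and o == "Speech"}
    # Simplification: assumes each speaker represents exactly one country.
    country_of = {s: o for s, p, o in triples if p == "countryOfRepresentation"}
    counts = Counter()
    for s, p, o in triples:
        if p == "speaker" and s in speeches and o in country_of:
            counts[country_of[o]] += 1
    return counts


print(speeches_per_country(TRIPLES))
```

Each of the three triple patterns in the WHERE clause corresponds to one lookup here; the Counter plays the role of GROUP BY ?c with COUNT.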
  • 72. Example 5: background info about the MEPs • MEPs that were not born in Europe. Members of Parliament Integrate data from the EU parliament with external datasets
  • 77. Linking Members of Parliament to Wikipedia / DBpedia • String matching is the most important feature in the linking process. • “nearly all [alignment systems] use a string similarity metric” [12] • Stop-word removal and stemming are not helpful, and neither is using WordNet synonyms. [12] • Example DBpedia resource: http://www.dbpedia.org/resource/Judith_Sargentini [12] Cheatham, M., & Hitzler, P. String similarity metrics for ontology alignment. ISWC 2013.
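As a rough illustration of string-based linking, the sketch below matches an MEP name against candidate DBpedia resource names. It uses the ratio from Python's `difflib.SequenceMatcher` as a stand-in similarity metric; the slides do not say which metric the project used, and the threshold of 0.9, the helper names and the candidate list are assumptions.

```python
from difflib import SequenceMatcher
from typing import Optional


def normalise(label: str) -> str:
    """Lower-case a label and turn URI-style underscores into spaces."""
    return label.replace("_", " ").strip().lower()


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalised labels."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def best_match(name: str, candidates: list[str], threshold: float = 0.9) -> Optional[str]:
    """Return the most similar candidate, or None if nothing reaches the threshold."""
    scored = [(similarity(name, c), c) for c in candidates]
    score, candidate = max(scored, default=(0.0, None))
    return candidate if score >= threshold else None


# Hypothetical candidate list of DBpedia local names.
CANDIDATES = ["Judith_Sargentini", "Martin_Schulz"]
print(best_match("Judith Sargentini", CANDIDATES))
```

Note that the only preprocessing is case folding and underscore replacement, in line with the finding cited on the slide that stop-word removal and stemming do not help.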
  • 81. How to relate a speech to a speaker and party? [Diagram: lp:eu/plenary/2009-10-21/Speech_140 lpv:speaker lp:EUmember_1023 . lp:EUmember_1023 lpv:hasParty lp:EUParty/SomeParty .] Why is this not a good solution? 1. A person might be a member of more than one party (at different times). 2. Since there is no link between a speech and a party, queries for all speeches spoken by the members of a certain party become very complicated.
  • 85. How to relate a speech to a speaker and party? [Diagram, written out as statements]
      lp:eu/plenary/2009-10-21/Speech_140 lpv:speaker lp:EUmember_1023 ; lpv:spokenAs lp:politicalFunction101 , lp:politicalFunction102 .
      lp:EUmember_1023 lpv:politicalFunction lp:politicalFunction101 , lp:politicalFunction102 .
      lp:politicalFunction101 rdf:type lpv:PoliticalFunction ; lpv:institution lp:EUParty/NI ; lpv:role lp:Role/member ; lpv:beginning "20071114"^^xsd:date ; lpv:end "20111126"^^xsd:date .
      lp:politicalFunction102 rdf:type lpv:PoliticalFunction ; lpv:institution lp:EUCommittee/Committee_on_Legal_Affairs ; lpv:role lp:Role/substitute ; lpv:beginning "20090716"^^xsd:date ; lpv:end "20111126"^^xsd:date .
    Note: this is a common “design pattern” referred to as n-ary relations or relations as classes.
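The n-ary pattern makes it possible to ask in which function a speaker was acting when a speech was held, by comparing the speech date with the period of each political function. The sketch below illustrates that lookup in plain Python. The class and function names are hypothetical, and the pairing of dates, roles and institutions is read off the slide diagram and should be treated as an assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PoliticalFunction:
    """One n-ary relation: who held which role in which institution, and when."""
    holder: str
    institution: str
    role: str
    beginning: date
    end: date


def spoken_as(speaker: str, speech_date: date, functions: list) -> list:
    """Return the functions the speaker held on the day of the speech."""
    return [
        f for f in functions
        if f.holder == speaker and f.beginning <= speech_date <= f.end
    ]


# Values as read from the slide diagram (assumed pairing of dates and roles).
FUNCTIONS = [
    PoliticalFunction("lp:EUmember_1023", "lp:EUParty/NI", "lp:Role/member",
                      date(2007, 11, 14), date(2011, 11, 26)),
    PoliticalFunction("lp:EUmember_1023", "lp:EUCommittee/Committee_on_Legal_Affairs",
                      "lp:Role/substitute", date(2009, 7, 16), date(2011, 11, 26)),
]

for function in spoken_as("lp:EUmember_1023", date(2009, 10, 21), FUNCTIONS):
    print(function.role, function.institution)
```

Because the party membership is a node of its own with a begin and end date, a person who changes party over time no longer causes ambiguity, and the result of this lookup is what an lpv:spokenAs link records directly on the speech.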
  • 86. Intermezzo: one-question quiz. Reasoning on the Web of Data
      Graph: <http://purl.org/linkedpolitics/EUmember_4545> foaf:name "Catherine Stihler" ; :memberOf <http://purl.org/linkedpolitics/EUParty/PES> , <http://dbpedia.org/resource/Party_of_European_Socialists> , <http://dbpedia.org/resource/Progressive_Alliance_of_Socialists_and_Democrats> .
      Question: What can we conclude from this graph? A. Stihler is a member of exactly 3 parties B. Stihler is a member of at least 3 parties C. Stihler is a member of at most 3 parties D. None of the above E. All of the above F. Other, namely ….
  • 87. Creating Linked Open Data in the PoliMedia project: Discovering links, knowledge representation, evaluation
  • 91. Which political debate in the post-war period has attracted most media attention? What are the differences between different media? Has the coverage changed over time?
  • 92. • Transcriptions of all 9,294 meetings of the Dutch parliament between 1945 and 1995, consisting of 1,208,903 speeches. • Roughly 1.8 million news bulletins between 1937 and 1984 (we only use 1945-1995). • Archives of hundreds of newspapers, with a very large number of issues and tens of millions of articles between 1618 and 1995 (we only use 1945-1995). • Transcriptions of all meetings of the European Parliament between 1999 and 2014.
  • 94. Links in PoliMedia • Link type: "is about" • 3 million links
  • 95. Discovering links between politics and news [Pipeline diagram] Debates -> detect topics in speeches (Topics) and detect named entities in speeches (Named Entities) -> create queries from the topics, named entities, name of speaker and date of debate (Queries) -> search newspaper archive (Candidate articles) -> rank candidate articles -> links between speeches and articles
  • 97. Step 2: generate links [Same pipeline diagram as slide 95: detect topics and named entities in speeches, create queries, search newspaper archive, rank candidate articles] • Intuition 1: the name of the speaker should appear in the article, and the article should be published within a week of the debate. • Intuition 2: the more the article and the speech overlap in terms of topics and named entities, the more they are related.
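The two intuitions can be read as a filter followed by a ranking. The sketch below shows one possible reading in Python; it is not the PoliMedia implementation. Assumptions: "within a week" is taken as at most 7 days before or after the debate, the overlap is measured with the Jaccard coefficient over a combined set of topics and named entities, and all names and data are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Speech:
    speaker: str
    debate_date: date
    terms: set  # detected topics and named entities


@dataclass
class Article:
    text: str
    published: date
    terms: set  # detected topics and named entities


def candidates(speech: Speech, articles: list, window_days: int = 7) -> list:
    """Intuition 1: speaker name occurs in the article, published close to the debate."""
    window = timedelta(days=window_days)
    return [
        a for a in articles
        if speech.speaker.lower() in a.text.lower()
        and abs(a.published - speech.debate_date) <= window
    ]


def overlap(a: set, b: set) -> float:
    """Jaccard coefficient; chosen here for illustration only."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank(speech: Speech, articles: list) -> list:
    """Intuition 2: order the candidates by how much they share with the speech."""
    return sorted(candidates(speech, articles),
                  key=lambda a: overlap(speech.terms, a.terms), reverse=True)


# Hypothetical example data.
speech = Speech("Jansen", date(1950, 3, 1), {"indonesia", "budget", "army"})
a1 = Article("Minister Jansen on the budget", date(1950, 3, 2), {"indonesia", "budget"})
a2 = Article("Jansen visits football match", date(1950, 3, 5), {"indonesia", "football"})
a3 = Article("Jansen looks back", date(1950, 4, 20), {"indonesia", "budget", "army"})
a4 = Article("Budget debate in parliament", date(1950, 3, 2), {"budget"})

print([a.text for a in rank(speech, [a2, a3, a4, a1])])
```

Article a3 overlaps most with the speech but is dropped by the date window, and a4 is dropped because the speaker's name is missing, which shows how the first intuition acts as a hard filter before any ranking happens.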
  • 100. Representation of links • Note: this is another example of the “design pattern” referred to as n-ary relations or relations as classes! • It allows us to save provenance information about the statements we create. [Diagram: instead of the direct statement :speech123 :isAbout :newsArticle456, a node :link001 connects :speech123 and :newsArticle456 and carries the link type (:quotes), :concept1 and :concept2, :creationDate 01-02-2013 and :madeBy :PoliMedia_Linking_Engine]
  • 104. Evaluation of links 1. Manually rating (a sample of) links • relatively cheap and easy to interpret • only precision, no recall 2. Comparison to a reference linkset • precision and recall • used in OAEI on the SEALS platform • more expensive if a reference alignment has to be created (but: crowd sourcing!) 3. End-to-end evaluation (a.k.a. evaluating an application that uses the mappings) • arguably the best method! • need to have access to an application + users
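For method 2, precision and recall follow directly from comparing the set of generated links with the reference linkset. A minimal Python sketch, with hypothetical link identifiers:

```python
def precision_recall(found: set, reference: set) -> tuple:
    """Precision and recall of a set of links against a reference linkset."""
    correct = found & reference
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical links, each a (speech, article) pair.
FOUND = {("s1", "a1"), ("s1", "a2"), ("s2", "a3"), ("s3", "a9")}
REFERENCE = {("s1", "a1"), ("s2", "a3"), ("s2", "a4"), ("s3", "a5"), ("s3", "a9")}

print(precision_recall(FOUND, REFERENCE))
```

Manually rating a sample (method 1) only tells us which of the found links are correct, so it yields the precision part of this computation but says nothing about the links that were missed.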
  • 105. Evaluation of links: beyond precision / recall [Diagram with nodes A, B, C and relation r; layers: data level, and ontology / vocabulary / schema level]
  • 106. Evaluation of links: beyond precision / recall • Generalized precision and generalized recall: instead of a binary classification into correct / incorrect mappings, take into account how wrong a link is [formula image], where r(a) is the semantic distance between correspondence a and correspondence a’ in the reference alignment, and A is the number of correspondences. Laura Hollink, Mark van Assem, Shenghui Wang, Antoine Isaac, Guus Schreiber. Two Variations on Ontology Alignment Evaluation: Methodological Issues. ESWC 2008. [Diagram as on slide 105]
  • 109. Evaluation of links in PoliMedia • How good are the links? We ask 2 raters to manually score pairs of newspaper articles and speeches. A pilot study showed that we needed more than a 2-point scale. Inter-rater agreement: 0.5 (acceptable, but not high). Precision: 80% • How many links did we miss? We ask the raters to manually search the KB archives for related articles. Recall: 62% [Chart values: Setting 1: 0.48, Setting 2: 0.62, Setting 3: 0.8]
  • 110. DEMO - PoliMedia search application
  • 116. Online database: “SPARQL endpoint” • A service to query a knowledge base using the SPARQL query language. “All speeches with more than 60 associated news items.”
  • 117. Access to Linked Open Data: how to serve and how to consume Linked Open Data
  • 120. Access to LOD: 1. download a data dump From server logs we know the query -some context of the requested URIs -variable names (?)
  • 128. Access to LOD 2: follow-your-nose [Diagram, written out as statements]
      lp:eu/plenary/2013-11-20/AgendaItem_6 dc:title "Award of the Sakharov Prize (formal sitting)."@en ; dc:hasPart lp:eu/plenary/2013-11-20/Speech_103 , lp:eu/plenary/2013-11-20/Speech_104 .
      lp:eu/plenary/2013-11-20/Speech_103 lpv:spokenText "...the fittest need to struggle for the survival of the weak.[...]"@en ; lpv:speaker lp:Speaker_Malala_Yousafzai ; lpv:hasSubsequent lp:eu/plenary/2013-11-20/Speech_104 .
      lp:eu/plenary/2013-11-20/Speech_104 lpv:spokenText "...Ich glaube, das war ein außergewöhnlicher Moment für uns alle hier in diesem Parlament[...]"@en (English: "I believe this was an extraordinary moment for all of us here in this Parliament") ; lpv:speaker lp:Martin_Schulz .
      lp:Martin_Schulz owl:sameAs http://dbpedia.org/resource/Martin_Schulz .
      http://dbpedia.org/resource/Martin_Schulz dbp:children "2" ; [property label not legible] dbc:Officiers_of_the_Légion_d'honneur .
    From server logs we know the requested URI: GET /Martin_Schulz HTTP/1.0 Accept: application/rdf+xml
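Following your nose means dereferencing a URI over HTTP and asking for RDF through content negotiation, which is exactly what shows up in the server log. The sketch below builds such a request with Python's standard library. The helper names are hypothetical, and nothing is sent over the network until `fetch_rdf` is called.

```python
from urllib.request import Request, urlopen


def rdf_request(uri: str) -> Request:
    """Build a GET request that asks the server for RDF/XML via content negotiation."""
    return Request(uri, headers={"Accept": "application/rdf+xml"}, method="GET")


def fetch_rdf(uri: str, timeout: float = 10.0) -> bytes:
    """Dereference the URI and return the raw response body (performs a network call)."""
    with urlopen(rdf_request(uri), timeout=timeout) as response:
        return response.read()


request = rdf_request("http://dbpedia.org/resource/Martin_Schulz")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

A client would parse the returned RDF, pick an interesting object URI such as the owl:sameAs target, and repeat the request for that URI. From the server's point of view each step is a single URI lookup without any query context.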
  • 129. Access to LOD: 3. SPARQL Count the agenda items in which at least one MEP from France spoke.
      SELECT (COUNT(DISTINCT ?ai) AS ?count)
      WHERE {
        ?ai rdf:type <http://purl.org/linkedpolitics/vocabulary/eu/plenary/AgendaItem> .
        ?ai dcterms:hasPart ?speech .
        ?speech lpv:speaker ?speaker .
        ?speaker lpv:countryOfRepresentation ?country .
        ?country rdfs:label ?label .
        FILTER(?label = "France"@en)
      }
  • 131. From server logs we know the query -some context of the requested URIs -variable names (?)
  • 136. Access to LOD: 4. Linked Data Fragments xxx.xxx.xxx.xxx - - [17/Oct/2014:07:43:02 +0000] "GET /2014/en?subject=&predicate=&object=dbpedia%3AAustin HTTP/1.1" 200 1309 "http://fragments.dbpedia.org/2014/en" … From server logs we know the triple patterns that were requested -some context of the requested URIs -variable names (?)
  • 137. What do we know about usage of Linked Open Data?
  • 140. Linked Open Data query log analysis? USEWOD 2011-2016: 1. Yearly datasets of server logs released for research purposes, 2011-2016. Luczak-Roesch, Markus, Aljaloud, Saud, Berendt, Bettina and Hollink, Laura (2016) USEWOD 2016 Research Dataset. doi:10.5258/SOTON/385344 2. Yearly workshops for researchers on Usage Data and the Web of Data, 2011-2016. Laura Hollink, Markus Luczak-Roesch, Bettina Berendt, et al. http://usewod.org/ • Licensing + anonymization: replace all IPs with a country code and an identifier
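The anonymization step can be pictured as rewriting the first field of each log line. The sketch below is one possible way to do that in Python and is not the procedure actually used for the USEWOD datasets. Assumptions: log lines start with an IPv4 address, the country lookup is a plain dictionary standing in for a real geolocation database, and the output format `CC-n` is invented for this example.

```python
import re

# An IPv4 address at the start of a log line.
IP_PATTERN = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3})")


class Anonymizer:
    """Replace each IP with '<country code>-<identifier>', stable per IP."""

    def __init__(self, country_of: dict):
        self.country_of = country_of  # hypothetical IP -> country code lookup
        self.ids = {}                 # IP -> sequential identifier

    def anonymize(self, line: str) -> str:
        found = IP_PATTERN.match(line)
        if not found:
            return line  # leave lines without a leading IP untouched
        ip = found.group(1)
        # The same IP always gets the same identifier, so sessions stay analysable.
        ident = self.ids.setdefault(ip, len(self.ids) + 1)
        country = self.country_of.get(ip, "XX")
        return f"{country}-{ident}{line[found.end():]}"


# Documentation-range addresses and made-up country assignments.
anonymizer = Anonymizer({"192.0.2.1": "NL", "192.0.2.7": "DE"})
LINE_1 = '192.0.2.1 - - [17/Oct/2014:07:43:02 +0000] "GET /2014/en?object=dbpedia%3AAustin HTTP/1.1" 200 1309'
LINE_2 = '192.0.2.7 - - [17/Oct/2014:07:43:05 +0000] "GET /2014/en?subject=dbpedia%3AAustin HTTP/1.1" 200 2210'

for line in (LINE_1, LINE_2, LINE_1):
    print(anonymizer.anonymize(line))
```

Keeping a stable identifier per IP preserves the ability to study repeated queries by the same client, while the country code retains coarse geographic information.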
  • 141. What has been found so far? • Efficient index generation [1] • Caching [2] • Auto-completion [3] • Hardware scaling at peak times [4] • Modularisation of data [4] Issues: • What is the difference between queries by machines and humans? [5] • What is the meaning of repeated queries by tools? Bots? • A lot of the usage is invisible due to data dump downloads [6] References: [1] Arias, M., Fernández, J. D., Martínez-Prieto, M. A., & de la Fuente, P. (2011). An empirical study of real-world SPARQL queries. USEWOD2011 [2] Lorey, J., & Naumann, F. Caching and prefetching strategies for SPARQL queries. USEWOD2013 [3] Kramer, K., Dividino, R. Q., & Gröner, G. SPACE: SPARQL Index for Efficient Autocompletion. ISWC2013 (Posters & Demos) [4] Luczak-Rösch, M., & Bischoff, M. (2011). Statistical analysis of web of data usage. EvoDyn2011 [5] Rietveld, L., & Hoekstra, R. Man vs. Machine: Differences in SPARQL Queries. USEWOD2014 [6] Huelss, J., & Paulheim, H. What SPARQL Query Logs Tell and do not Tell about Semantic Relatedness in LOD. NoISE @ ESWC 2015
  • 143. Reflection: to what extent can we now answer these questions? • How did the debate about the financial crisis in Greece develop? • Which political event has attracted most media attention? • What are the differences between different media? • Has the coverage changed over time? Yes, but: • What is the influence of the selection of newspapers available at the National Library? • What was the quality of the digitisation process (OCR)? • How good is our linking approach (based on automatically detected entities and topics)? ➡ How to handle these uncertainties is one of our research questions! We call this Tool Criticism.
  • 144. Resources: • PoliMedia demo: http://polimedia.nl/ • PoliMedia project video: https://youtu.be/u24oRCj7xrQ • Talk of Europe project: http://talkofeurope.eu/ • Talk of Europe data: purl.org/linkedpolitics • Talk of Europe project video: https://youtu.be/GxA53gkCe0o • USEWOD workshop: http://usewod.org/ • My website: http://homepages.cwi.nl/~hollink/ I’d be happy to answer your questions!