Introduction to Linked Data
Laura Po - Exploration, Visualization and Querying of Linked Open Data sources
2nd Keystone Training School - Keyword Search in Big Linked Data, University of Santiago de Compostela (USC), Spain.
Objectives
By the end of this module you should have an understanding of
• What is linked data
• What is open data
• What is the difference between linked and open data
• How to publish linked data (5-star schema)
• What are the linked data principles and the linked data technologies
(the semantic web stack)
• The economic and social impact of linked data
The Web of Data
The evolution from a Web of linked documents to a web of linked data
The Web as a huge decentralized database (knowledge base) of machine-
accessible data
Web of documents... Web of linked data...
The evolution of the web
• The Web started as a collection of documents published online, each accessible at a Web location identified by a URL.
• These documents often contain data about real-world resources which is mainly human-readable and cannot be understood by machines.
• The Web of Data is about enabling access to this data by making it available in machine-readable formats and connecting it using Uniform Resource Identifiers (URIs), so that people and machines can collect the data and combine it for any purpose permitted by the licence.
Machine-readable data (or
metadata) is data in a format that
can be interpreted by a computer.
2 types of machine-readable
data:
• human-readable data that is
marked up so that it can also
be understood by computers,
e.g. microformats, RDFa;
• data formats intended
principally for computers, e.g.
RDF, XML and JSON.
Linked Data and the ‘Web of Data’
● Term refers to an idea originally from Tim Berners-Lee
(Tim Berners-Lee, Linked Data, 2006, http://www.w3.org/DesignIssues/LinkedData.html)
● Set of best practices for publication and linking of
structured data on the web
● Basic assumption: The value of data on the web increases
when they are connected to other data sources
M. Hausenblas, Quick Linked Data Introduction, http://www.slideshare.net/mediasemanticweb/quick-linked-data-introduction
The Semantic Web isn't just
about putting data on the
web. It is about making
links, so that a person or
machine can explore the web
of data. With linked data,
when you have some of it,
you can find other, related,
data.
Defining linked data
“Linked data is a set of design principles for sharing
machine-readable data on the Web for use by public
administrations, business and citizens.”
EC ISA Case Study: How Linked Data is transforming eGovernment
Linked Data Principles
1. Use URIs as names for things.
2. Use HTTP URIs, so that people can look up those names.
3. When someone looks up a URI, provide useful information,
using the standards (RDF, SPARQL)
4. Include links to other URIs, so that they can discover more
things.
How to get Data from the Web?
● Data can only be found on the Web, if it is available at some website
[Figure: Browser ↔ HTTP ↔ Web Server ↔ JDBC ↔ Database]
How to get Data from the Web?
● There are a number of different (proprietary) Web APIs and data exchange formats, with mashups built on top of them
[Figure: Database 1 to Database 4, each exposed through its own Web API (Web API 1 to Web API 4), combined in a mashup]
In the Web today...
● Data is locked up in small data islands
● Other applications usually cannot access this data...
[Figure: many separate databases, each an isolated data island]
Semantic Web Technologies, Dr. Harald Sack, Hass
http://www.w3.org/2009/Talks/0204-ted-tbl/#(22)
How to get rid of Closed Data Islands?
● Apply Semantic Web technologies
○ to publish (structured) data on the web
○ to draw connections from one data source to data from other data sources
[Figure: Database 1 to Database 4, each published as RDF data]
Linked Data Principles (1/4)
1. Use URIs as names for things.
○ URIs do not only identify documents but also arbitrary objects
of the real world as well as abstract concepts
https://viaf.org/viaf/32197206/
http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart
http://musicbrainz.org/artist/20244d07-534f-4eff-b4d4-930878889970
http://www.imdb.com/title/tt3659388
Linked Data Principles (2/4)
2. Use HTTP URIs, so that people can look up those names.
○ HTTP URIs (URLs) as globally unique names enable
dereferencing of associated information on the Web
○ Via HTTP content negotiation, machines and humans can
access the resource identified by the URI
[Figure: content negotiation for a DBpedia resource]
URI represents Designatum: http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart
URI represents Designator (HTML document, for humans): http://dbpedia.org/page/Wolfgang_Amadeus_Mozart
URI represents Designator (RDF document, for machines): http://dbpedia.org/data/Wolfgang_Amadeus_Mozart
Dereferenceable
Every term in a LOD source
must be accessible via its URI
through an HTTP GET. Accessing
the URI returns
the definition of the term.
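As a minimal sketch of dereferencing with content negotiation, the following Python snippet only builds the HTTP GET requests and does not send them. The helper names `build_request` and `representation_urls` are illustrative assumptions, the /page/ and /data/ paths follow the DBpedia example above, and the two media types are a typical choice rather than a fixed requirement.

```python
from urllib.request import Request

RESOURCE = "http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart"


def build_request(uri: str, accept: str) -> Request:
    """Build (but do not send) an HTTP GET request asking for one media type."""
    return Request(uri, headers={"Accept": accept}, method="GET")


def representation_urls(resource_uri: str) -> dict:
    """Map a DBpedia resource URI to the document URLs shown on the slide."""
    return {
        "html": resource_uri.replace("/resource/", "/page/", 1),
        "rdf": resource_uri.replace("/resource/", "/data/", 1),
    }


# Same URI, two different Accept headers: one for humans, one for machines.
for_humans = build_request(RESOURCE, "text/html")
for_machines = build_request(RESOURCE, "text/turtle")
docs = representation_urls(RESOURCE)
```

Passing either request to `urllib.request.urlopen` would perform the actual lookup; which document the server returns depends on the server's own negotiation rules.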
Linked Data Principles (3/4)
3. When someone looks up a URI, provide useful information, using the
standards (RDF, SPARQL)
○ RDF as universal data model for publishing structured data on the Web
○ Make all URIs in the RDF graph dereferenceable
○ Avoid RDF constructs that cause problems in Linked Data context
■ RDF Reification
■ RDF Collections and Containers
■ unnamed Blank Nodes
Linked Data Principles (4/4)
4. Include links to other URIs, so that they can discover more things.
○ Add RDF links between resources in different data sources, for example:
○ owl:sameAs – links individuals that are identical
○ rdfs:seeAlso – states that a resource may provide additional information
○ Relationship Links
Links to external LOD entities related to the original entity
○ Identity Links
Links to external LOD entities referring to the same object or concept
○ Vocabulary Links
Links to definitions of the original entity
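Identity links can be followed transitively to collect all URIs that name the same thing. The sketch below groups URIs connected by owl:sameAs; the function name `identity_groups` and all example.org URIs are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical triples: the example.org URIs are made up for illustration.
triples = [
    ("http://dbpedia.org/resource/Santiago_de_Compostela", OWL_SAME_AS,
     "http://example.org/geo/Santiago"),
    ("http://example.org/geo/Santiago", OWL_SAME_AS,
     "http://example.org/stats/city/1"),
    ("http://dbpedia.org/resource/Spain", OWL_SAME_AS,
     "http://example.org/geo/Spain"),
]


def identity_groups(triples):
    """Group URIs connected (directly or indirectly) by owl:sameAs links."""
    neighbours = defaultdict(set)
    for s, p, o in triples:
        if p == OWL_SAME_AS:
            neighbours[s].add(o)
            neighbours[o].add(s)  # sameAs is symmetric
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:  # depth-first walk over the identity links
            node = stack.pop()
            if node not in group:
                group.add(node)
                stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(group)
    return groups


groups = identity_groups(triples)
```

Each resulting group is a set of URIs that, according to the links, refer to the same object or concept.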
Advantages of Linked Open Data vs. APIs
○ A simple, generic API for heterogeneous data sources
enables simple reuse and data sharing among applications
○ The RDF data model guarantees (simple) extensibility
○ Transport via HTTP on the standard port 80 avoids firewall changes
○ Ontologies enable meaningful connections between data sources
○ Reasoning over Linked Data makes it possible to derive new knowledge,
i.e. to make implicit knowledge explicit
The Semantic Web Technology Stack
http://dbpedia.org/resource/Santiago_de_Compostela
Santiago de Compostela
URI - Uniform Resource Identifier
From Wikipedia to DBpedia
https://en.wikipedia.org/wiki/Santiago_de_Compostela
http://dbpedia.org/resource/Santiago_de_Compostela
RDF Resource Description Framework
:Santiago_de_Compostela rdf:type dbo:City .
:Santiago_de_Compostela dbo:country dbr:Spain .
:Santiago_de_Compostela owl:sameAs geodata:Santiago di Compostela .
dbr:University_of_Santiago_de_Compostela dbp:city dbr:Santiago_de_Compostela .
:Santiago_de_Compostela dbp:populationTotal "95671"^^xsd:integer .
...
RDF Triple: :Santiago (RDF Subject) rdf:type (RDF Property) dbo:City (RDF Object) .
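The prefixed names used above are shorthand for full URIs. A small sketch of the expansion follows; the function name `expand` is illustrative, and treating the empty prefix ":" as the DBpedia resource namespace is an assumption, since the slide does not declare it.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "dbo": "http://dbpedia.org/ontology/",
    "dbp": "http://dbpedia.org/property/",
    "dbr": "http://dbpedia.org/resource/",
    # Assumption: the empty prefix ":" on the slide stands for dbr:.
    "": "http://dbpedia.org/resource/",
}


def expand(name: str) -> str:
    """Expand a prefixed name such as 'dbo:City' into a full URI."""
    prefix, _, local = name.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return PREFIXES[prefix] + local


# One triple (subject, property, object) with every term expanded.
triple = tuple(expand(term) for term in
               (":Santiago_de_Compostela", "rdf:type", "dbo:City"))
```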
● Resource
○ can be anything
○ must be uniquely identified and referenceable via a URI
● Description
○ = description of resources
○ by representing properties and relationships among resources as graphs
● Framework
○ = combination of web-based protocols and formats (URI, HTTP, XML, Turtle, JSON, …)
○ based on a formal model (semantics)
● Knowledge in RDF is expressed as a list of statements
● All RDF statements follow the same simple schema (= RDF Triple)
Resource Description Framework
● RDF Statements (RDF Triple):
Subject + Property + Object / Value
RDF building blocks: URI + URI + URI / Literal
N-Triples serialization:
<http://dbpedia.org/resource/Santiago_de_Compostela> <http://dbpedia.org/ontology/populationTotal> "95671" .
Graph representation:
<http://dbpedia.org/resource/Santiago_de_Compostela> --<http://dbpedia.org/ontology/populationTotal>--> "95671"
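To make the N-Triples form concrete, here is a minimal sketch that serializes one triple as a single line. The helper names `term` and `ntriple` are illustrative, and the escaping covers only backslashes and double quotes, so it is not a complete N-Triples writer.

```python
def term(value: str, is_literal: bool = False) -> str:
    """Render one RDF term: <...> for a URI, "..." for a plain literal."""
    if is_literal:
        # Simplification: only backslash and double quote are escaped here.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"<{value}>"


def ntriple(subject: str, prop: str, obj: str, literal_object: bool = False) -> str:
    """Serialize one triple as a single N-Triples line ending in ' .'."""
    return f"{term(subject)} {term(prop)} {term(obj, literal_object)} ."


line = ntriple(
    "http://dbpedia.org/resource/Santiago_de_Compostela",
    "http://dbpedia.org/ontology/populationTotal",
    "95671",
    literal_object=True,
)
print(line)
```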
Resource Description Framework
● URIs and Literals
○ URIs reference resources uniquely
○ Literals describe data values that don’t have a separate existence
<http://dbpedia.org/resource/Santiago_de_Compostela> <http://dbpedia.org/ontology/country> <http://dbpedia.org/resource/Spain> .
<http://dbpedia.org/resource/Santiago_de_Compostela> <http://dbpedia.org/ontology/populationTotal> "95671" .
RDF Schema
dbo:City rdf:type owl:Class .
dbo:City rdfs:subClassOf dbo:Settlement .
dbo:foundationPlace rdfs:range dbo:City .
...
[Figure: City rdfs:subClassOf Settlement; foundationPlace rdfs:range City]
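The schema statements above let a machine derive facts that were never stated explicitly. The sketch below applies two RDFS rules (rdfs:range and rdfs:subClassOf) until no new triples appear; the function name `infer` and the instance triple about `ex:SomeCompany` are hypothetical, added only so there is something to infer from.

```python
RDF_TYPE, SUBCLASS, RANGE = "rdf:type", "rdfs:subClassOf", "rdfs:range"

triples = {
    ("dbo:City", SUBCLASS, "dbo:Settlement"),
    ("dbo:foundationPlace", RANGE, "dbo:City"),
    # Hypothetical instance data, not taken from DBpedia.
    ("ex:SomeCompany", "dbo:foundationPlace", "dbr:Santiago_de_Compostela"),
}


def infer(triples):
    """Apply the rdfs:range and rdfs:subClassOf rules until a fixed point."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for p2, rel, cls in inferred:
                # rdfs:range: the object of property p is an instance of cls
                if rel == RANGE and p2 == p:
                    new.add((o, RDF_TYPE, cls))
                # rdfs:subClassOf: instances of a class also belong to its superclass
                if rel == SUBCLASS and p == RDF_TYPE and p2 == o:
                    new.add((s, RDF_TYPE, cls))
        if new <= inferred:
            return inferred
        inferred |= new


closure = infer(triples)
```

Under these assumptions the closure contains two derived statements: the city is typed as dbo:City via the range rule, and then as dbo:Settlement via the subclass rule.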
The Semantic Web Technology Stack
http://dbpedia.org/ontology/City
[Figure labels: rdfs:range, logical constraint, City, Spain, Madrid, dbo:country, rdf:type, rdfs:subClassOf, description logics + logical rules, classes, entities]
Small_town ∩ Capital = ∅
∀x. ( City(x) ∧ seatOfGovernment(x) → Capital(x) )
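The rule and the disjointness constraint shown above can be read as plain set operations. The following sketch uses Python sets; the individuals assigned to each class are hypothetical and serve only to exercise the rule.

```python
# Hypothetical facts: class membership is assumed for illustration only.
city = {"Madrid", "Santiago_de_Compostela"}
seat_of_government = {"Madrid"}
small_town = {"Some_small_town"}

# Rule: for all x, City(x) and seatOfGovernment(x) implies Capital(x)
capital = {x for x in city if x in seat_of_government}


def is_consistent(small_town: set, capital: set) -> bool:
    """Constraint: Small_town and Capital must have an empty intersection."""
    return small_town.isdisjoint(capital)
```

A knowledge base that listed the same individual as both a small town and a capital would violate the constraint, which is how a reasoner can flag contradictory data.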
The Semantic Web Technology Stack
Look for all cities located in the same area as
Santiago de Compostela (use the property
dbp:subdivisionName)
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT DISTINCT ?area ?city
FROM <http://dbpedia.org/> WHERE {
  ?area dbp:subdivisionName dbr:Santiago_de_Compostela .
  ?area dbp:subdivisionName ?city .
}
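To show how the two triple patterns in the WHERE clause are evaluated, here is a toy matcher for a basic graph pattern. It is a sketch, not how a real SPARQL engine works internally; the function names and the `ex:` triples are hypothetical.

```python
def match(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`."""
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if p_term.startswith("?"):            # variable
            if result.setdefault(p_term, t_term) != t_term:
                return None                    # conflicts with an earlier binding
        elif p_term != t_term:                 # constant must match exactly
            return None
    return result


def select(patterns, triples):
    """Evaluate a list of triple patterns; variables shared across patterns join."""
    bindings = [{}]  # start with one empty solution
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                candidate = match(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return bindings


# Hypothetical data: ex: names are invented for this example.
triples = [
    ("ex:AreaA", "dbp:subdivisionName", "dbr:Santiago_de_Compostela"),
    ("ex:AreaA", "dbp:subdivisionName", "ex:CityB"),
    ("ex:AreaC", "dbp:subdivisionName", "ex:CityD"),
]
patterns = [
    ("?area", "dbp:subdivisionName", "dbr:Santiago_de_Compostela"),
    ("?area", "dbp:subdivisionName", "?city"),
]
rows = select(patterns, triples)
```

Because `?area` appears in both patterns, only triples about the same area survive the second step, which is the join that the query relies on.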
The Semantic Web Technology Stack
http://dbpedia.org/sparql
SPARQL – SPARQL Protocol and RDF Query Language
A query language for RDF data, designed with a syntax similar to that of SQL, the language used for retrieving data from relational databases.
Different query forms:
• SELECT returns variables and their bindings directly.
• CONSTRUCT returns a single RDF graph specified by a graph template.
• ASK tests whether or not a query pattern has a solution. Returns yes/no.
• DESCRIBE returns a single RDF graph containing RDF data about resources.
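The "Protocol" half of the name covers how a query reaches an endpoint over HTTP: in the simplest case the query text travels as the `query` parameter of a GET request. The sketch below only builds such a URL for the DBpedia endpoint mentioned earlier and sends nothing; the helper name `query_url` and the ASK query are illustrative assumptions.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://dbpedia.org/sparql"

# Illustrative ASK query: is Santiago de Compostela typed as a city?
ASK_QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
ASK { dbr:Santiago_de_Compostela a dbo:City }
""".strip()


def query_url(endpoint: str, query: str) -> str:
    """Build the GET URL for a SPARQL query (sent as the 'query' parameter)."""
    return endpoint + "?" + urlencode({"query": query})


url = query_url(ENDPOINT, ASK_QUERY)
# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```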
SQL versus SPARQL
SQL | SPARQL
Based on relations (tables). | Based on labelled directed graphs.
The relations (tables) to be matched over should be indicated. | Assumes a default graph. (The FROM clause populates this with specific identified subgraphs.)
(Retrieval) queries produce a relation from a relation. | SPARQL SELECT queries produce a relation from a graph. CONSTRUCT queries (considered later) produce a graph from a graph.
The application of the Linked Data Principles leads to a ‘Web of Data’
>1014 Datasets
>74B RDF Triples
808M Links
as of August 2014
The Development of the Web of Data
[Figures: snapshots of the Web of Data for May 2007, Nov 2007, July 2009 and Aug 2014; one further snapshot has no date label]
Linked Open Data
○ Public Linked Data resources in the Web, licensed as Creative Commons CC-BY
○ Tim Berners-Lee’s 5-Star Criteria for Linked Open Data
★ Available on the web (whatever format) but with an open licence, to be Open Data
★★ Available as machine-readable structured data (e.g. Excel instead of image scan of a table)
★★★ As (2) plus non-proprietary format (e.g. CSV instead of Excel)
★★★★ All the above, plus: use open standards from W3C (URI, RDF and SPARQL) to identify things, so that people can point at your stuff
★★★★★ All the above, plus: link your data to other people’s data to provide context
http://5stardata.info/en/
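Because each star builds on the ones below it, a rating can be computed by counting how many levels are satisfied in order. The following sketch illustrates this; the `Dataset` fields and the `stars` function are hypothetical names, not part of any official tool.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    on_web_open_licence: bool      # 1 star
    machine_readable: bool         # 2 stars
    non_proprietary_format: bool   # 3 stars
    uses_w3c_standards: bool       # 4 stars (URI, RDF, SPARQL)
    linked_to_other_data: bool     # 5 stars


def stars(ds: Dataset) -> int:
    """Count stars; each level requires all the levels below it."""
    levels = [ds.on_web_open_licence, ds.machine_readable,
              ds.non_proprietary_format, ds.uses_w3c_standards,
              ds.linked_to_other_data]
    count = 0
    for satisfied in levels:
        if not satisfied:
            break
        count += 1
    return count


excel_file = Dataset(True, True, False, False, False)
csv_file = Dataset(True, True, True, False, False)
```

With these inputs, the openly licensed Excel file rates two stars and the CSV file three, matching the examples in the criteria.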
December 2007
8 principles for the Open Government Data:
Complete
Primary (not aggregate)
Up to date
Accessible
Machine processable
Non-discriminatory
Non-proprietary
No license fees
https://opengovdata.org/
Linked Data vs Open Data
Open data
Data can be published and be publicly available under an open licence without linking to other data sources.
Linked data
Data can be linked to URIs from other data sources, using open standards such as RDF, without being publicly available under an open licence.
“Open data is data that can be freely used, reused and redistributed by anyone – subject only, at most, to the requirement to attribute and sharealike.”
- OpenDefinition.org
See also:
Cobden et al., A research agenda for Linked Closed Data
http://ceur-ws.org/Vol-782/CobdenEtAl_COLD2011.pdf
Linked (open) government data
• Flexible data integration: Linked Open Government Data (LOGD) facilitates data integration and enables the interconnection of previously disparate government datasets.
• Increase in data quality: The increased (re)use of LOGD triggers a growing demand to improve data quality. Through crowd-sourcing and self-service mechanisms, errors are progressively corrected.
• New services: The availability of LOGD gives rise to new services offered by the public and/or private sector.
• Cost reduction: The reuse of LOGD in e-Government applications leads to considerable cost reductions.
See also:
ISA Study on Business Models for LOGD
https://joinup.ec.europa.eu/community/semic/document/study-business-models-linked-open-government-data-bm4logd
Key milestones for linked government data
Linked Data - A Guided Tour
● Datasets ordered by category
http://linkeddatacatalog.dws.informatik.uni-mannheim.de/state/
Government
● 183 datasets
● top 10 highest indegree: reference.data.gov.uk
● 48 proprietary vocabularies used
● c. 21% fully dereferenceable
Dereferenceable
Every term in a LOD source must be accessible via its URI through an HTTP GET. Accessing the URI returns the definition of the term.
The dereferenceability quota of a LOD source is defined as the number of dereferenceable terms divided by the number of all terms collected in the source.
Fully dereferenceable LOD source: a definition exists for all URIs.
Partially dereferenceable LOD source: a definition could be retrieved for some terms, but not for all.
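The quota defined above is a simple ratio, sketched here in Python. The input maps each term URI to whether an HTTP GET returned a definition; the function names, the example.org terms and the third label ("not dereferenceable", for a quota of zero) are assumptions added for illustration.

```python
def quota(terms: dict) -> float:
    """Dereferenceable terms divided by all terms in the source."""
    if not terms:
        raise ValueError("the source contains no terms")
    return sum(terms.values()) / len(terms)


def classify(terms: dict) -> str:
    """Label a source by its quota (the zero case is an added assumption)."""
    q = quota(terms)
    if q == 1.0:
        return "fully dereferenceable"
    if q > 0.0:
        return "partially dereferenceable"
    return "not dereferenceable"


# Hypothetical check results: term URI -> did an HTTP GET return a definition?
checked = {
    "http://example.org/vocab#City": True,
    "http://example.org/vocab#Town": True,
    "http://example.org/vocab#Village": False,
    "http://example.org/vocab#Hamlet": True,
}
```

For the hypothetical source above, three of four terms resolve, so the quota is 0.75 and the source counts as partially dereferenceable.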
Media
● 22 datasets
● 22 proprietary vocabularies used
● 0% fully dereferenceable
● 9% partially dereferenceable
User Generated Content
● 48 datasets
● top 10 highest outdegree: semanticweb.org
● 30 proprietary vocabularies used
● 13% fully dereferenceable
● 10% partially dereferenceable
Linguistics
● no statistics available so far
Bibliographic Data
● 96 datasets
● top 10 highest indegree: data.semanticweb.org
● top 10 highest outdegree: bibsonomy.org
● 58 proprietary vocabularies used
● 21% fully dereferenceable
● 7% partially dereferenceable
Life Sciences
● 83 datasets
● 35 proprietary vocabularies used
● 28% fully dereferenceable
● 6% partially dereferenceable
Cross Domain
● 41 datasets
● top 10 highest indegree: dbpedia.org, w3.org,
lexvo.org
● 55 proprietary vocabularies used
● 27% fully dereferenceable
● 11% partially dereferenceable
Social Networking
● 520 datasets
● top 10 highest indegree: quitter.se, status.net, …
● top 10 highest outdegree: deri.org, harth.org,...
● 128 proprietary vocabularies used
● 16% fully dereferenceable
● 6% partially dereferenceable
Geographic
● 21 datasets
● top 10 highest indegree: geonames.org
● 24 proprietary vocabularies used
● 21% fully dereferenceable
● 4% partially dereferenceable
Linked Data Ontologies
● Ontologies hold the
Linked Data Cloud together
● OWL
owl:sameAs connects identical
individuals
owl:equivalentClass connects
equivalent classes
Linked Data Ontologies
● Ontologies hold the
Linked Data Cloud together
● SKOS
○ „Simple Knowledge Organization System“
○ based on RDF and RDFS
○ applied for definitions and mappings of
vocabularies and ontologies
■ skos:Concept (classes)
■ skos:narrower
■ skos:broader
■ skos:related
■ skos:exactMatch (vocabulary)
■ skos:narrowMatch
■ skos:broadMatch
■ skos:relatedMatch
Linked Data Ontologies
● Ontologies hold the
Linked Data Cloud together
● UMBEL
○ “Upper Mapping and Binding Exchange Layer”
○ Subset of OpenCyc as RDF triples based on SKOS and OWL2
○ Upper ontology with 28,000 concepts (skos:Concept)
○ 46,000 mappings into DBpedia, GeoNames and others (owl:equivalentClass, rdfs:subClassOf)
○ Links to more than 2 million Wikipedia pages
Member State initiatives – some examples
Some examples on supra-national, national, regional and private initiatives in the
area of linked (open) data across Europe.
DE – Bibliotheksverbund Bayern
Linked data from 180 academic libraries in Bavaria, Berlin and Brandenburg.
IT – Agenzia per l’Italia Digitale
Three datasets published as linked data: the Index of Public Administration, the SPC contracts for web
services and conduction systems and the Classifications for the data in Public Administration.
NL – Building and address register
The Dutch Address and Buildings base register published as linked data.
UK – Ordnance Survey
Three OS Open Data products published as linked data: the 1:50 000 Scale Gazetteer, Code-Point Open
and the administrative geography taken from Boundary Line.
UK – Companies House
Publishing basic company details as linked data
using a simple URI for each company in their database.
See also:
ISA Study on Business Models for LOGD
https://joinup.ec.europa.eu/community/semic/document/study-business-models-linked-open-government-data-bm4logd
Linked Government Data & Metadata initiatives
funded by the European Commission
ADMS.SW
CORE PUBLIC SERVICE VOCABULARY
Linked Government Data Pilots
http://health.testproject.eu/PPP/
http://maritime.testproject.eu/CISE/
http://cpsv.testproject.eu/CPSV/
Non-governmental applications
Conclusion
• Linked data is a set of design principles for sharing machine-readable data on the Web.
• Linked data and open data are not the same.
• URIs, RDF and SPARQL form the foundational layer for linked data.
• Linked data offers a number of advantages:
• Data integration with small impact on legacy systems;
• Semantic interoperability;
• Creativity and innovation through context and knowledge creation.
Group questions
Is there supply and demand for (Linked) Open
Government Data in your country?
What are, in your opinion, the expected benefits
and pitfalls of Linked Data?
Do you know if there are any Linked (Open) Data
initiatives in your country? If so, how many stars
would you give them?
Download the slides from
My research group website
www.dbgroup.unimore.it
SlideShare
http://www.slideshare.net/polaura
References
Some of the materials used in these slides have been rearranged from
- Slides of the “Knowledge Engineering with Semantic Web Technologies 2015” course held by Dr. Harald Sack
https://open.hpi.de/courses/semanticweb2015
- Slides of the "Introduction to linked data" of Open Data Support
http://www.slideshare.net/OpenDataSupport/introduction-to-linked-data-23402165
- Slides of "Usage of Linked Data Introduction and Application Scenarios" and "Querying Linked Data" by Barry Norton, EUCLID project
Further readings
Linked Open Government Data. Li Ding (Qualcomm), Vassilios Peristeras and Michael Hausenblas.
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6237454
EUCLID - Course 1: Introduction and Application Scenarios
http://www.euclid-project.eu/modules/course1
Linked Open Data: The Essentials. Florian Bauer, Martin Kaltenböck.
http://www.semantic-web.at/LOD-TheEssentials.pdf
Linked Data: Evolving the Web into a Global Data Space. Tom Heath and Christian Bizer.
http://linkeddatabook.com/editions/1.0/
Related projects and initiatives
LOD2 FP7 project, http://lod2.eu/
The Open Knowledge Foundation, http://okfn.org/
W3C Semantic Web, http://www.w3.org/standards/semanticweb/
EUCLID, http://projecteuclid.org/
ISA Programme, http://ec.europa.eu/isa/
W3C LOGD WG, http://www.w3.org/2011/gld/wiki/Main_Page
LOD Around The Clock FP7 project, http://latc-project.eu/
Data.gov.uk, http://data.gov.uk/linked-data

More Related Content

What's hot (20)

ZIP
Intro to Linked Open Data in Libraries, Archives & Museums
Jon Voss
 
PDF
Tutorial Data Management and workflows
SSSW
 
PPTX
Omitola birmingham cityuniv
Tope Omitola
 
PPT
Linked library data
Jindřich Mynarz
 
PPTX
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
Armin Haller
 
PPTX
Elephant in the Room: Scaling Storage for the HathiTrust Research Center
Robert H. McDonald
 
PDF
Data hv seminar_thadthong_v05_slshr
Thadthong Bhrammanee JJ/Ja-ay
 
PDF
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Worl...
National Information Standards Organization (NISO)
 
PDF
Linked data as a library data platform
Jindřich Mynarz
 
PDF
From Structured Data to Linked Open Governmental Data
Dongpo Deng
 
PPTX
The Web of Data: do we actually understand what we built?
Frank van Harmelen
 
PPT
Library Linked Data and the Future of Bibliographic Control
University of Toronto Libraries - Information Technology Services
 
PDF
ITWS 4310: Building and Consuming the Web of Data (Fall 2013)
Rensselaer Polytechnic Institute
 
PDF
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
National Information Standards Organization (NISO)
 
PPTX
Extracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
Mathieu d'Aquin
 
PDF
Semantic Web / Linked Data Technologies
Mathieu d'Aquin
 
PPTX
Linked Data at the Open University: From Technical Challenges to Organization...
Mathieu d'Aquin
 
PPTX
Experience from 10 months of University Linked Data
Mathieu d'Aquin
 
PPT
Linking library data
Jindřich Mynarz
 
PPTX
Working with data.open.ac.uk, the Linked Data Platform of the Open University
Mathieu d'Aquin
 
Intro to Linked Open Data in Libraries, Archives & Museums
Jon Voss
 
Tutorial Data Management and workflows
SSSW
 
Omitola birmingham cityuniv
Tope Omitola
 
Linked library data
Jindřich Mynarz
 
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
Armin Haller
 
Elephant in the Room: Scaling Storage for the HathiTrust Research Center
Robert H. McDonald
 
Data hv seminar_thadthong_v05_slshr
Thadthong Bhrammanee JJ/Ja-ay
 
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Worl...
National Information Standards Organization (NISO)
 
Linked data as a library data platform
Jindřich Mynarz
 
From Structured Data to Linked Open Governmental Data
Dongpo Deng
 
The Web of Data: do we actually understand what we built?
Frank van Harmelen
 
Library Linked Data and the Future of Bibliographic Control
University of Toronto Libraries - Information Technology Services
 
ITWS 4310: Building and Consuming the Web of Data (Fall 2013)
Rensselaer Polytechnic Institute
 
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
National Information Standards Organization (NISO)
 
Extracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
Mathieu d'Aquin
 
Semantic Web / Linked Data Technologies
Mathieu d'Aquin
 
Linked Data at the Open University: From Technical Challenges to Organization...
Mathieu d'Aquin
 
Experience from 10 months of University Linked Data
Mathieu d'Aquin
 
Linking library data
Jindřich Mynarz
 
Working with data.open.ac.uk, the Linked Data Platform of the Open University
Mathieu d'Aquin
 

Viewers also liked (7)

PPTX
Aggregating Multiple Dimensions for Computing Document Relevance
José Ramón Ríos Viqueira
 
PPTX
School intro
José Ramón Ríos Viqueira
 
PPTX
Information Retrieval Evaluation
José Ramón Ríos Viqueira
 
PDF
1st KeyStone Summer School - Hackathon Challenge
Joel Azzopardi
 
PDF
Tai lieu on_tap_hoa_phan_tich
Phi Phi
 
PDF
Curse of Dimensionality and Big Data
Stephane Marchand-Maillet
 
PDF
Introduzione a Linked Open data e Web semantico / Antonella Iacono
libriedocumenti
 
Aggregating Multiple Dimensions for Computing Document Relevance
José Ramón Ríos Viqueira
 
Information Retrieval Evaluation
José Ramón Ríos Viqueira
 
1st KeyStone Summer School - Hackathon Challenge
Joel Azzopardi
 
Tai lieu on_tap_hoa_phan_tich
Phi Phi
 
Curse of Dimensionality and Big Data
Stephane Marchand-Maillet
 
Introduzione a Linked Open data e Web semantico / Antonella Iacono
libriedocumenti
 
Ad

Similar to Introduction to linked data (20)

ODP
Linked Data
Danny Ayers
 
PPTX
Linked open data project
Faathima Fayaza
 
PPTX
Linked Data MLA 2015
Cason Snow
 
PPTX
Linked data MLA 2015
Cason Snow
 
ODP
Linked Data
cyriacsmail
 
PPTX
Linked data HHS 2015
Cason Snow
 
PPTX
Jarrar: Introduction to Linked Data
Mustafa Jarrar
 
PDF
Methodological Guidelines for Publishing Linked Data
Boris Villazón-Terrazas
 
PPTX
Linked Data Tutorial (Florianópolis)
Oscar Corcho
 
PPTX
Madrid Building blocks of Linked Data
Victor de Boer
 
PDF
Jarrar: Linked Data
Mustafa Jarrar
 
PDF
INSPIRE Hackathon Webinar Intro to Linked Data and Semantics
plan4all
 
PDF
Llinked open data training for EU institutions
Open Data Support
 
PPSX
Linked Data to Improve the OER Experience
The Open Education Consortium
 
PPTX
Consuming Linked Data SemTech2010
Juan Sequeda
 
PPT
Future of Web 2.0 & The Semantic Web
is20090
 
PPTX
Get on the Linked Data Web!
Armin Haller
 
PPT
RDFa From Theory to Practice
Adrian Stevenson
 
PDF
Linked data and Semantic Web Applications for Libraries
Vikas Bhushan
 
PPTX
Introduction to Linked Data 1/5
Juan Sequeda
 
Linked Data
Danny Ayers
 
Linked open data project
Faathima Fayaza
 
Linked Data MLA 2015
Cason Snow
 
Linked data MLA 2015
Cason Snow
 
Linked Data
cyriacsmail
 
Linked data HHS 2015
Cason Snow
 
Jarrar: Introduction to Linked Data
Mustafa Jarrar
 
Methodological Guidelines for Publishing Linked Data
Boris Villazón-Terrazas
 
Linked Data Tutorial (Florianópolis)
Oscar Corcho
 
Madrid Building blocks of Linked Data
Victor de Boer
 
Jarrar: Linked Data
Mustafa Jarrar
 
INSPIRE Hackathon Webinar Intro to Linked Data and Semantics
plan4all
 
Llinked open data training for EU institutions
Open Data Support
 
Linked Data to Improve the OER Experience
The Open Education Consortium
 
Consuming Linked Data SemTech2010
Juan Sequeda
 
Future of Web 2.0 & The Semantic Web
is20090
 
Get on the Linked Data Web!
Armin Haller
 
RDFa From Theory to Practice
Adrian Stevenson
 
Linked data and Semantic Web Applications for Libraries
Vikas Bhushan
 
Introduction to Linked Data 1/5
Juan Sequeda
 
Ad

More from Laura Po (13)

PPTX
Towards sustainable mobility for citizens and the environment @ AI, HPC and B...
Laura Po
 
PPTX
Big data analytics for smart and sustainable city galway
Laura Po
 
PPTX
TRAFAIR - Premio PA sostenibile 2019 - slide di presentazione
Laura Po
 
PDF
TRAFAIR - Premio PA sostenibile 2019
Laura Po
 
PPTX
Session 1 and 2 "Challenges and Opportunities with Big Linked Data Visualiza...
Laura Po
 
PDF
Session 3 "Challenges and Opportunities with Big Linked Data Visualization" t...
Laura Po
 
PDF
Building an urban theft map by analyzing newspaper - SMAP 2018
Laura Po
 
PDF
Linked Open Data Visualization
Laura Po
 
PPTX
Wi2015 - Clustering of Linked Open Data - the LODeX tool
Laura Po
 
PDF
Comparing topic models for a movie recommendation system webist2014
Laura Po
 
PPTX
An iPad Order Management System for Fashion Trade
Laura Po
 
PPTX
A Non-Intrusive Movie Recommendation System
Laura Po
 
PPTX
A meta language for mdx queries in e log business
Laura Po
 
Towards sustainable mobility for citizens and the environment @ AI, HPC and B...
Laura Po
 
Big data analytics for smart and sustainable city galway
Laura Po
 
TRAFAIR - Premio PA sostenibile 2019 - slide di presentazione
Laura Po
 
TRAFAIR - Premio PA sostenibile 2019
Laura Po
 
Session 1 and 2 "Challenges and Opportunities with Big Linked Data Visualiza...
Laura Po
 
Session 3 "Challenges and Opportunities with Big Linked Data Visualization" t...
Laura Po
 
Building an urban theft map by analyzing newspaper - SMAP 2018
Laura Po
 
Linked Open Data Visualization
Laura Po
 
Wi2015 - Clustering of Linked Open Data - the LODeX tool
Laura Po
 
Comparing topic models for a movie recommendation system webist2014
Laura Po
 
An iPad Order Management System for Fashion Trade
Laura Po
 
A Non-Intrusive Movie Recommendation System
Laura Po
 
A meta language for mdx queries in e log business
Laura Po
 


Introduction to linked data

  • 1. Introduction to Linked Data Laura Po - Exploration, Visualization and Querying of Linked Open Data sources 2nd Keystone Training School - Keyword Search in Big Linked Data, University of Santiago de Compostela (USC), Spain. Laura Po
  • 3. Objectives By the end of this module you should have an understanding of • What is linked data • What is open data • What is the difference between linked and open data • How to publish linked data (5-star schema) • What are the linked data principles and the linked data technologies (the semantic web stack) • The economic and social impact of linked data
  • 5. The Web of Data: the evolution from a Web of linked documents to a Web of linked data. The Web as a huge decentralized database (knowledge base) of machine-accessible data. Web of documents... Web of linked data...
  • 6. The evolution of the web • The Web started as a collection of documents published online, each accessible at a Web location identified by a URL. • These documents often contain data about real-world resources which is mainly human-readable and cannot be understood by machines. • The Web of Data is about enabling access to this data by making it available in machine-readable formats and connecting it using Uniform Resource Identifiers (URIs), thus enabling people and machines to collect the data and put it together to do all kinds of things with it (as permitted by the licence). Machine-readable data (or metadata) is data in a format that can be interpreted by a computer. 2 types of machine-readable data: • human-readable data that is marked up so that it can also be understood by computers, e.g. microformats, RDFa; • data formats intended principally for computers, e.g. RDF, XML and JSON.
  • 7. Linked Data and the ‘Web of Data’ ● Term refers to an idea originally from Tim Berners-Lee (Tim Berners-Lee, Linked Data, 2006, http://www.w3.org/DesignIssues/LinkedData.html) ● Set of best practices for publication and linking of structured data on the web ● Basic assumption: the value of data on the web increases when it is connected to other data sources (M. Hausenblas, Quick Linked Data Introduction, http://www.slideshare.net/mediasemanticweb/quick-linked-data-introduction) The Semantic Web isn't just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related, data.
  • 8. Defining linked data “Linked data is a set of design principles for sharing machine-readable data on the Web for use by public administrations, business and citizens.” EC ISA Case Study: How Linked Data is transforming eGovernment
  • 9. Linked Data Principles 1. Use URIs as names for things. 2. Use HTTP URIs, so that people can look up those names. 3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL) 4. Include links to other URIs, so that they can discover more things.
  • 11. How to get Data from the Web? ● Data can only be found on the Web if it is available at some website (diagram labels: Browser, HTTP, Web Server, JDBC, Database)
  • 12. How to get Data from the Web? ● There are a number of different (proprietary) Web APIs, data exchange formats and mashups on top of them (diagram: Databases 1 to 4, Web APIs 1 to 4, Mashup)
  • 13. In the Web today... ● Data is locked up in small data islands ● Other applications usually cannot access this data... (diagram: ten isolated databases)
  • 14. Semantic Web Technologies, Dr. Harald Sack, Hass http://www.w3.org/2009/Talks/0204-ted-tbl/#(22)
  • 15. How to get rid of Closed Data Islands? ● Apply Semantic Web technologies ○ to publish (structured) data on the web ○ to draw connections from one data source to data from other data sources (diagram: Databases 1 to 4, each with RDF data)
  • 16. Linked Data Principles (1/4) 1. Use URIs as names for things. ○ URIs do not only identify documents but also arbitrary objects of the real world as well as abstract concepts https://viaf.org/viaf/32197206/ http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart http://musicbrainz.org/artist/20244d07-534f-4eff-b4d4-930878889970 http://www.imdb.com/title/tt3659388
  • 17. Linked Data Principles (2/4) 2. Use HTTP URIs, so that people can look up those names. ○ HTTP URIs (URLs) as globally unique names enable dereferencing of associated information on the Web ○ via HTTP content negotiation, machines and humans can both access the resource identified by the URI (diagram: the URI http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart represents the designatum; the URIs http://dbpedia.org/page/Wolfgang_Amadeus_Mozart (HTML document, for humans) and http://dbpedia.org/data/Wolfgang_Amadeus_Mozart (RDF document, for machines) represent designators) Dereferenceable: every term in a LOD source must be accessible via its URI through an HTTP GET. Once we access the URI, we find the definition of the term.
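The content negotiation described on slide 17 can be sketched in Python. This is a minimal illustration, not DBpedia's actual server logic: the function names are hypothetical, the media types text/turtle and text/html are assumed choices, and the requests are only built, never sent. A real server would typically answer the resource URI with a redirect to one of the two document URIs; here the mapping is done locally to mirror the slide.

```python
from urllib.request import Request

# Resource URI taken from the slide.
RESOURCE = "http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart"


def negotiated_request(uri: str, for_machine: bool) -> Request:
    """Build (but do not send) an HTTP GET request for a Linked Data URI.

    The Accept header tells the server which representation is wanted.
    The two media types used here are assumptions for illustration.
    """
    accept = "text/turtle" if for_machine else "text/html"
    return Request(uri, headers={"Accept": accept}, method="GET")


def document_uri(resource_uri: str, for_machine: bool) -> str:
    """Map the resource URI to the document URI shown on the slide.

    /data/ holds the RDF document, /page/ the HTML document.
    """
    target = "/data/" if for_machine else "/page/"
    return resource_uri.replace("/resource/", target, 1)


machine_request = negotiated_request(RESOURCE, for_machine=True)
human_request = negotiated_request(RESOURCE, for_machine=False)
print(machine_request.get_header("Accept"), document_uri(RESOURCE, True))
print(human_request.get_header("Accept"), document_uri(RESOURCE, False))
```

The same URI names the thing for both audiences; only the Accept header differs.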
  • 18. Linked Data Principles (3/4) 3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL) ○ RDF as universal data model for publishing structured data on the Web ○ Make all URIs in the RDF graph dereferenceable ○ Avoid RDF constructs that cause problems in Linked Data context ■ RDF Reification ■ RDF Collections and Containers ■ unnamed Blank Nodes
  • 19. Linked Data Principles (4/4) 4. Include links to other URIs, so that they can discover more things. ○ Link RDF references among data from different data sources: ○ owl:sameAs – creates an identity link between individuals ○ rdfs:seeAlso – states that a resource may provide additional information ○ Relationship Links: links to external LOD entities related to the original entity ○ Identity Links: links to external LOD entities referring to the same object or concept ○ Vocabulary Links: links to definitions of the original entity
  • 20. Advantages of Linked Open Data vs. APIs ○ A simple and generic API for various heterogeneous data sources enables simple reuse and data sharing among applications ○ The RDF data model guarantees (simple) extensibility ○ Transport via HTTP on standard port 80 avoids the need to adapt firewalls ○ Ontologies enable meaningful connections between data sources ○ Reasoning over Linked Data makes it possible to generate new knowledge, i.e. to infer explicit knowledge from implicit knowledge
  • 23. The Semantic Web Technology Stack http://dbpedia.org/resource/Santiago_de_Compostela Santiago de Compostela URI - Uniform Resource Identifier
  • 24. From Wikipedia to DBpedia https://en.wikipedia.org/wiki/Santiago_de_Compostela http://dbpedia.org/resource/Santiago_de_Compostela
  • 25. From Wikipedia to DBpedia http://dbpedia.org/resource/Santiago_de_Compostela
  • 26. RDF Resource Description Framework. From Wikipedia to DBpedia http://dbpedia.org/resource/Santiago_de_Compostela :Santiago_de_Compostela rdf:type dbo:City . :Santiago_de_Compostela dbo:country dbr:Spain . :Santiago_de_Compostela owl:sameAs geodata:Santiago di Compostela . dbr:University_of_Santiago_de_Compostela dbp:city dbr:Santiago_de_Compostela . :Santiago_de_Compostela dbp:populationTotal 95671 (xsd:integer) . ... (annotated example: in ":Santiago rdf:type dbo:City ." the RDF subject is :Santiago, the RDF property is rdf:type and the RDF object is dbo:City; together they form an RDF triple)
  • 27. Resource Description Framework ● Resource ○ can be everything ○ must be uniquely identified and referenceable via URI ● Description ○ = description of resources ○ via representing properties and relationships among resources as graphs ● Framework ○ = combination of web-based protocols (URI, HTTP, XML, Turtle, JSON, …) ○ based on formal model (semantics) ● Knowledge in RDF is expressed as a list of statements ● All RDF statements follow the same simple schema (= RDF Triple)
  • 28. Resource Description Framework ● RDF Statements (RDF Triple): Subject + Property + Object / Value, i.e. URI + URI + URI / Literal (RDF building blocks) ● N-Triples serialization: <http://dbpedia.org/resource/Santiago_de_Compostela> <http://dbpedia.org/ontology/populationTotal> "95671" . ● Graph representation: the same triple drawn as an edge labelled <http://dbpedia.org/ontology/populationTotal> from the node <http://dbpedia.org/resource/Santiago_de_Compostela> to the literal "95671".
  • 29. Resource Description Framework ● URIs and Literals ○ URIs reference resources uniquely ○ Literals describe data values that don’t have a separate existence (graph: <http://dbpedia.org/resource/Santiago_de_Compostela> <http://dbpedia.org/ontology/country> <http://dbpedia.org/resource/Spain> . <http://dbpedia.org/resource/Santiago_de_Compostela> <http://dbpedia.org/ontology/populationTotal> "95671" .)
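The subject, property and object structure from slides 28 and 29 can be sketched as a small Python data model with an N-Triples writer. The class and function names are hypothetical and chosen for illustration; the writer only escapes backslashes and double quotes and ignores datatypes and language tags, so it is a simplified sketch rather than a complete N-Triples implementation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    """A resource identified by a URI."""
    value: str


@dataclass(frozen=True)
class Literal:
    """A plain data value with no separate existence."""
    value: str


@dataclass(frozen=True)
class Triple:
    subject: URI
    prop: URI
    obj: Union[URI, Literal]  # the object may be a resource or a value


def to_ntriples(triple: Triple) -> str:
    """Serialize one triple as a single N-Triples line."""
    def term(x: Union[URI, Literal]) -> str:
        if isinstance(x, URI):
            return f"<{x.value}>"
        escaped = x.value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"{term(triple.subject)} {term(triple.prop)} {term(triple.obj)} ."


santiago = URI("http://dbpedia.org/resource/Santiago_de_Compostela")
population = Triple(
    santiago,
    URI("http://dbpedia.org/ontology/populationTotal"),
    Literal("95671"),
)
country = Triple(
    santiago,
    URI("http://dbpedia.org/ontology/country"),
    URI("http://dbpedia.org/resource/Spain"),
)
for t in (population, country):
    print(to_ntriples(t))
```

The first triple ends in a literal and the second in a URI, which is the distinction slide 29 draws.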
  • 30. RDF Schema dbo:City rdf:type owl:Class . dbo:City rdfs:subClassOf dbo:Settlement . dbo:foundationPlace rdfs:range dbo:City . ... (diagram: City rdfs:subClassOf Settlement; foundationPlace rdfs:range City) The Semantic Web Technology Stack http://dbpedia.org/ontology/City
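What the two schema statements on slide 30 allow a reasoner to conclude can be sketched as follows. This is a deliberately small illustration of rdfs:subClassOf and rdfs:range entailment, not a full RDFS reasoner: each class has at most one direct superclass here, the function names are hypothetical, and the subject ex:SomeOrganisation in the example triple is invented for the demonstration.

```python
# Schema statements taken from the slide.
SUBCLASS_OF = {"dbo:City": "dbo:Settlement"}
RANGE = {"dbo:foundationPlace": "dbo:City"}


def superclasses(cls, subclass_of):
    """Return all superclasses of cls, following rdfs:subClassOf upwards."""
    found = []
    while cls in subclass_of and subclass_of[cls] not in found:
        cls = subclass_of[cls]
        found.append(cls)
    return found


def infer_types(triples, subclass_of, ranges):
    """Return the set of (entity, class) pairs entailed by the data and schema."""
    types = set()
    for s, p, o in triples:
        if p == "rdf:type":
            types.add((s, o))          # explicitly stated type
        if p in ranges:
            types.add((o, ranges[p]))  # rdfs:range: the object gets the range class
    for entity, cls in list(types):
        for sup in superclasses(cls, subclass_of):
            types.add((entity, sup))   # rdfs:subClassOf: inherit superclasses
    return types


# Hypothetical data triple; the subject is invented for illustration.
data = [("ex:SomeOrganisation", "dbo:foundationPlace", "dbr:Santiago_de_Compostela")]
inferred = infer_types(data, SUBCLASS_OF, RANGE)
print(sorted(inferred))
```

Nothing in the data says that Santiago de Compostela is a city or a settlement; both facts follow from the schema alone.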
  • 31. The Semantic Web Technology Stack: description logics + logical rules. Logical constraint: Small_town ∩ Capital = ∅. Rule: ∀x. ( City(x) ∧ seatOfGovernment(x) → Capital(x) ). (diagram nodes: City, Spain, Madrid; edges: dbo:country, rdf:type, rdfs:subClassOf; levels: classes, entities)
  • 32. Look for all cities located in the same area as Santiago de Compostela (use the property dbp:subdivisionName) PREFIX dcterms: <http://purl.org/dc/terms/> PREFIX skos: <http://www.w3.org/2004/02/skos/core#> PREFIX dbp: <http://dbpedia.org/property/> PREFIX dbr: <http://dbpedia.org/resource/> SELECT DISTINCT ?area ?city FROM <http://dbpedia.org/> WHERE { ?area dbp:subdivisionName dbr:Santiago_de_Compostela . ?area dbp:subdivisionName ?city . } The Semantic Web Technology Stack http://dbpedia.org/sparql
  • 33. http://dbpedia.org/sparql Look for all cities located in the same area as Santiago de Compostela (use the property dbp:subdivisionName)
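As a rough sketch of how the query on slide 32 could be submitted programmatically, the Python snippet below builds an HTTP GET request in the style of the SPARQL Protocol, where the query text travels in a `query` URL parameter. The function name is hypothetical, the JSON results media type is an assumed choice, and the request is only constructed, not sent, so nothing here says what the endpoint would return.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint and query text as given on the slide.
ENDPOINT = "http://dbpedia.org/sparql"
QUERY = """\
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT DISTINCT ?area ?city
FROM <http://dbpedia.org/>
WHERE {
  ?area dbp:subdivisionName dbr:Santiago_de_Compostela .
  ?area dbp:subdivisionName ?city .
}
"""


def sparql_get_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a GET request carrying a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for SPARQL results in JSON; this media type is an assumed choice.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = sparql_get_request(ENDPOINT, QUERY)
print(request.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) is left out on purpose, since it depends on network access and on the endpoint being available.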
  • 34. SPARQL – SPARQL Protocol and RDF Query Language. A query language for RDF data, designed with a syntax similar to SQL (which retrieves data from relational databases). Different query forms: • SELECT returns variables and their bindings directly. • CONSTRUCT returns a single RDF graph specified by a graph template. • ASK tests whether or not a query pattern has a solution. Returns yes/no. • DESCRIBE returns a single RDF graph containing RDF data about resources.
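The difference between the SELECT and ASK query forms can be illustrated with a toy pattern matcher over an in-memory list of triples. This is a sketch of the idea of matching triple patterns, not a SPARQL engine: the function names are hypothetical, terms are plain strings, variables are strings starting with "?", and the triples are the prefixed examples from slide 26.

```python
def match(patterns, triples, binding=None):
    """Yield one dict of variable bindings per solution of the triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        matched = True
        for pattern_term, triple_term in zip(first, triple):
            if pattern_term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(pattern_term, triple_term) != triple_term:
                    matched = False
                    break
            elif pattern_term != triple_term:
                matched = False
                break
        if matched:
            yield from match(rest, triples, candidate)


def select(variables, patterns, triples):
    """SELECT: return the bindings of the requested variables."""
    return [tuple(b[v] for v in variables) for b in match(patterns, triples)]


def ask(patterns, triples):
    """ASK: return True if the pattern has at least one solution."""
    return next(match(patterns, triples), None) is not None


GRAPH = [
    ("dbr:Santiago_de_Compostela", "rdf:type", "dbo:City"),
    ("dbr:Santiago_de_Compostela", "dbo:country", "dbr:Spain"),
    ("dbr:University_of_Santiago_de_Compostela", "dbp:city", "dbr:Santiago_de_Compostela"),
]

cities_in_spain = select(
    ["?city"],
    [("?city", "rdf:type", "dbo:City"), ("?city", "dbo:country", "dbr:Spain")],
    GRAPH,
)
spain_is_city = ask([("dbr:Spain", "rdf:type", "dbo:City")], GRAPH)
print(cities_in_spain, spain_is_city)
```

SELECT hands back the variable bindings as rows, while ASK collapses the same matching process into a single yes/no answer.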
  • 35. SQL versus SPARQL ● SQL: based on relations (tables). SPARQL: based on labelled directed graphs. ● SQL: the relations (tables) to be matched over should be indicated. SPARQL: assumes a default graph (the FROM clause populates this with specific identified subgraphs). ● SQL: (retrieval) queries produce a relation from a relation. SPARQL: SELECT queries produce a relation from a graph; CONSTRUCT queries (considered later) produce a graph from a graph.
  • 37. The application of the Linked Data Principles leads to a ‘Web of Data’: >1014 datasets, >74B RDF triples, 808M links (as of August 2014)
  • 38. The Development of the Web of Data May 2007
  • 39. The Development of the Web of Data Nov 2007
  • 40. The Development of the Web of Data
  • 41. The Development of the Web of Data July 2009
  • 42. The Development of the Web of Data Aug 2014
  • 43. Linked Open Data ○ Public Linked Data resources on the Web, licensed as Creative Commons CC-BY ○ Tim Berners-Lee’s 5-Star Criteria for Linked Open Data: ★ Available on the web (whatever format) but with an open licence, to be Open Data ★★ Available as machine-readable structured data (e.g. Excel instead of image scan of a table) ★★★ As (2) plus non-proprietary format (e.g. CSV instead of Excel) ★★★★ All the above plus: use open standards from W3C (URI, RDF and SPARQL) to identify things, so that people can point at your stuff ★★★★★ All the above, plus: link your data to other people’s data to provide context
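Because each star in the 5-star scheme builds on the previous ones, a rating can be sketched as counting criteria until the first one that fails. The `Dataset` class, its field names and the `stars` function below are hypothetical names made up for this illustration; how each criterion would be checked for a real dataset is outside the scope of the sketch.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """One flag per star criterion, in the order of the 5-star scheme."""
    open_licence: bool          # 1 star: on the web under an open licence
    structured: bool            # 2 stars: machine-readable structured data
    non_proprietary: bool       # 3 stars: non-proprietary format
    uses_w3c_standards: bool    # 4 stars: URIs, RDF and SPARQL
    linked_to_other_data: bool  # 5 stars: links to other people's data


def stars(dataset: Dataset) -> int:
    """Count consecutive satisfied criteria; stop at the first failure."""
    criteria = [
        dataset.open_licence,
        dataset.structured,
        dataset.non_proprietary,
        dataset.uses_w3c_standards,
        dataset.linked_to_other_data,
    ]
    count = 0
    for satisfied in criteria:
        if not satisfied:
            break
        count += 1
    return count


# An openly licensed CSV file: structured and non-proprietary, but no RDF.
csv_file = Dataset(True, True, True, False, False)
print(stars(csv_file))
```

A dataset that links to other data but does not use the W3C standards still stops at three stars, which reflects the "all the above, plus" wording of the scheme.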
  • 46. December 2007: 8 principles for Open Government Data: Complete; Primary (not aggregate); Up to date; Accessible; Machine processable; Non-discriminatory; Non-proprietary; No license fees. https://opengovdata.org/
  • 47. Linked Data vs Open Data. Open data: data can be published and be publicly available under an open licence without linking to other data sources. Linked data: data can be linked to URIs from other data sources, using open standards such as RDF, without being publicly available under an open licence. “Open data is data that can be freely used, reused and redistributed by anyone – subject only, at most, to the requirement to attribute and sharealike.” - OpenDefinition.org See also: Cobden et al., A research agenda for Linked Closed Data http://ceur-ws.org/Vol-782/CobdenEtAl_COLD2011.pdf
  • 48. Linked (open) government data • Flexible data integration: LOGD facilitates data integration and enables the interconnection of previously disparate government datasets. • Increase in data quality: The increased (re)use of LOGD triggers a growing demand to improve data quality. Through crowd-sourcing and self-service mechanisms, errors are progressively corrected. • New services: The availability of LOGD gives rise to new services offered by the public and/or private sector. • Cost reduction: The reuse of LOGD in e-Government applications leads to considerable cost reductions. See also: ISA Study on Business Models for LOGD https://joinup.ec.europa.eu/community/semic/document/study-business-models-linked-open-government-data-bm4logd
  • 49. Key milestones for linked government data
  • 51. Linked Data - A Guided Tour ● Datasets ordered by category http://linkeddatacatalog.dws.informatik.uni-mannheim.de/state/
  • 52. Government ● 183 datasets ● top 10 highest indegree: reference.data.gov.uk ● 48 proprietary vocabularies used ● c. 21% fully dereferenceable. Dereferenceable: every term in a LOD source must be accessible via its URI through an HTTP GET. Once we access the URI, we find the definition of the term. The dereferenceability quota of a LOD source is defined as the number of dereferenceable terms divided by the number of all terms collected in the source. Fully dereferenceable LOD source: a definition exists for all URIs. Partially dereferenceable LOD source: a definition could be retrieved for some terms, but not for all.
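The dereferenceability quota defined on slide 52 is a simple ratio, which can be sketched as below. The function names and the example terms are hypothetical; the input is assumed to be a mapping from each term URI to whether an HTTP GET on it returned a definition, and performing those lookups is not part of the sketch. The label "not dereferenceable" for a quota of zero is an assumption, since the slide only names the fully and partially dereferenceable cases.

```python
from typing import Mapping


def dereferenceability_quota(terms: Mapping[str, bool]) -> float:
    """Number of dereferenceable terms divided by the number of all terms."""
    if not terms:
        raise ValueError("the source contains no terms")
    return sum(1 for ok in terms.values() if ok) / len(terms)


def classify(terms: Mapping[str, bool]) -> str:
    """Label a source according to its quota."""
    quota = dereferenceability_quota(terms)
    if quota == 1.0:
        return "fully dereferenceable"
    if quota > 0.0:
        return "partially dereferenceable"
    return "not dereferenceable"  # assumed label, not defined on the slide


# Hypothetical lookup results for four terms of one source.
lookups = {
    "http://example.org/vocab#City": True,
    "http://example.org/vocab#Country": True,
    "http://example.org/vocab#population": True,
    "http://example.org/vocab#mayor": False,
}
print(dereferenceability_quota(lookups), classify(lookups))
```

The per-category percentages on the following slides count how many sources in each category fall into the fully or partially dereferenceable class.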
  • 53. Media ● 22 datasets ● 22 proprietary vocabularies used ● 0% fully dereferenceable ● 9% partially dereferenceable
  • 54. User Generated Content ● 48 datasets ● top 10 highest outdegree: semanticweb.org ● 30 proprietary vocabularies used ● 13% fully dereferenceable ● 10% partially dereferenceable
  • 55. Linguistics ● no statistics available so far
  • 56. Bibliographic Data ● 96 datasets ● top 10 highest indegree: data.semanticweb.org ● top 10 highest outdegree: bibsonomy.org ● 58 proprietary vocabularies used ● 21% fully dereferenceable ● 7% partially dereferenceable
  • 57. Life Sciences ● 83 datasets ● 35 proprietary vocabularies used ● 28% fully dereferenceable ● 6% partially dereferenceable
  • 58. Cross Domain ● 41 datasets ● top 10 highest indegree: dbpedia.org, w3.org, lexvo.org ● 55 proprietary vocabularies used ● 27% fully dereferenceable ● 11% partially dereferenceable
  • 59. Social Networking ● 520 datasets ● top 10 highest indegree: quitter.se, status.net, … ● top 10 highest outdegree: deri.org, harth.org, ... ● 128 proprietary vocabularies used ● 16% fully dereferenceable ● 6% partially dereferenceable
  • 60. Geographic ● 21 datasets ● top 10 highest indegree: geonames.org ● 24 proprietary vocabularies used ● 21% fully dereferenceable ● 4% partially dereferenceable
  • 61. Linked Data Ontologies ● Ontologies hold the Linked Data Cloud together ● OWL ○ owl:sameAs connects identical individuals ○ owl:equivalentClass connects equivalent classes
  • 62. Linked Data Ontologies ● Ontologies hold the Linked Data Cloud together ● SKOS ○ “Simple Knowledge Organization System” ○ based on RDF and RDFS ○ applied for definitions and mappings of vocabularies and ontologies ■ skos:Concept (classes) ■ skos:narrower ■ skos:broader ■ skos:related ■ skos:exactMatch (vocabulary) ■ skos:narrowMatch ■ skos:broadMatch ■ skos:relatedMatch
  • 63. Linked Data Ontologies ● Ontologies hold the Linked Data Cloud together ● UMBEL ○ “Upper Mapping and Binding Exchange Layer” ○ Subset of OpenCyc as RDF triples based on SKOS and OWL 2 ○ Upper ontology with 28,000 concepts (skos:Concept) ○ 46,000 mappings into DBpedia, GeoNames and others (owl:equivalentClass, rdfs:subClassOf) ○ Links to more than 2 million Wikipedia pages
  • 65. Member State initiatives – some examples. Some examples of supra-national, national, regional and private initiatives in the area of linked (open) data across Europe. DE – Bibliotheksverbund Bayern: linked data from 180 academic libraries in Bavaria, Berlin and Brandenburg. IT – Agenzia per l’Italia Digitale: three datasets published as linked data: the Index of Public Administration, the SPC contracts for web services and conduction systems and the Classifications for the data in Public Administration. NL – Building and address register: the Dutch Address and Buildings base register published as linked data. UK – Ordnance Survey: three OS Open Data products published as linked data: the 1:50 000 Scale Gazetteer, Code-Point Open and the administrative geography taken from Boundary Line. UK – Companies House: publishing basic company details as linked data using a simple URI for each company in their database. See also: ISA Study on Business Models for LOGD https://joinup.ec.europa.eu/community/semic/document/study-business-models-linked-open-government-data-bm4logd
  • 66. Linked Government Data & Metadata initiatives funded by the European Commission ADMS. SW CORE PUBLIC SERVICE VOCABULARY
  • 67. Linked Government Data Pilots http://health.testproject.eu/PPP/ http://maritime.testproject.eu/CISE/ http://cpsv.testproject.eu/CPSV/
  • 69. Conclusion • Linked data is a set of design principles for sharing machine-readable data on the Web. • Linked data and open data are not the same. • URIs, RDF and SPARQL form the foundational layer for linked data. • Linked data offers a number of advantages: • data integration with small impact on legacy systems; • semantic interoperability; • creativity and innovation through context and knowledge creation.
  • 70. Group questions Is there supply and demand for (Linked) Open Government Data in your country? What are, in your opinion, the expected benefits and pitfalls of Linked Data? Do you know if there are any Linked (Open) Data initiatives in your country? If so, how many stars would you give them?
  • 72. Download the slides from my research group website www.dbgroup.unimore.it or on SlideShare http://www.slideshare.net/polaura
  • 73. References Some of the materials used in these slides have been rearranged from - Slides of the “Knowledge Engineering with Semantic Web Technologies 2015” course held by Dr. Harald Sack https://open.hpi.de/courses/semanticweb2015 - Slides of the "Introduction to linked data" of Open Data Support http://www.slideshare.net/OpenDataSupport/introduction-to-linked-data-23402165 - Slides of "Usage of Linked Data Introduction and Application Scenarios" and "Querying Linked Data" by Barry Norton, EUCLID project
  • 74. Further readings Linked Open Government Data. Li Ding Qualcomm, Vassilios Peristeras and Michael Hausenblas. http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6237454 EUCLID - Course 1: Introduction and Application Scenarios http://www.euclid-project.eu/modules/course1 Linked Open Data: The Essentials. Florian Bauer, Martin Kaltenböck. http://www.semantic-web.at/LOD-TheEssentials.pdf Linked Data: Evolving the Web into a Global Data Space. Tom Heath and Christian Bizer. http://linkeddatabook.com/editions/1.0/
  • 75. Related projects and initiatives: LOD2 FP7 project, http://lod2.eu/; The Open Knowledge Foundation, http://okfn.org/; W3C Semantic Web, http://www.w3.org/standards/semanticweb/; EUCLID, http://projecteuclid.org/; ISA Programme, http://ec.europa.eu/isa/; W3C LOGD WG, http://www.w3.org/2011/gld/wiki/Main_Page; LOD Around The Clock FP7 project, http://latc-project.eu/; Data.gov.uk, http://data.gov.uk/linked-data