Querying Linked Data with SPARQL

ISWC 2009 Tutorial "How to Consume Linked Data on the Web"
Brief Introduction to SPARQL
     ●   SPARQL: Query Language for RDF data
     ●   Main idea: pattern matching
          ●   Describe subgraphs of the queried RDF graph
          ●   Subgraphs that match your description yield a result
          ●   Means: graph patterns (i.e., RDF graphs with variables)

          ●   Example pattern:   ?v  rdf:type  <http://.../Volcano>




Brief Introduction to SPARQL
 Queried graph:
     <http://.../Mount_Baker>  rdf:type        <http://.../Volcano> .
     <http://.../Mount_Baker>  p:lastEruption  "1880" .
     <http://.../Mount_Etna>   rdf:type        <http://.../Volcano> .

 Graph pattern:
     ?v  rdf:type  <http://.../Volcano>

 Results:
     ?v
     http://.../Mount_Baker
     http://.../Mount_Etna
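
As a rough illustration of the matching step, the following plain-Java sketch evaluates the single triple pattern from this slide against the three triples of the queried graph. The class and method names (TriplePatternDemo, matchSubjects) are made up for this example, and the URIs are kept in the shortened form used on the slide. A real SPARQL engine also handles variables in other positions and joins over several patterns, which this sketch does not attempt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Minimal sketch: match the pattern (?v rdf:type <.../Volcano>) against a
// small in-memory graph. Names are illustrative, not part of any library.
class TriplePatternDemo {

    // The queried graph from the slide as (subject, predicate, object) triples.
    static final List<String[]> GRAPH = Arrays.asList(new String[][] {
        {"http://.../Mount_Baker", "rdf:type", "http://.../Volcano"},
        {"http://.../Mount_Baker", "p:lastEruption", "\"1880\""},
        {"http://.../Mount_Etna", "rdf:type", "http://.../Volcano"}});

    // Returns the bindings for ?v: every subject that occurs in a triple
    // with the given predicate and object.
    static List<String> matchSubjects(String predicate, String object) {
        List<String> bindings = new ArrayList<>();
        for (String[] t : GRAPH) {
            if (t[1].equals(predicate) && t[2].equals(object)) {
                bindings.add(t[0]);
            }
        }
        return bindings;
    }

    public static void main(String[] args) {
        // Prints the two volcano URIs listed under "Results" on the slide.
        System.out.println(matchSubjects("rdf:type", "http://.../Volcano"));
    }
}
```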
SPARQL Endpoints
     ●   Linked data sources usually provide a
         SPARQL endpoint for their dataset(s)
     ●   SPARQL endpoint: SPARQL query processing
         service that supports the SPARQL protocol*
     ●   Send your SPARQL query, receive the result




                    * http://www.w3.org/TR/rdf-sparql-protocol/

SPARQL Endpoints
     Data Source            Endpoint Address

     DBpedia                http://dbpedia.org/sparql
     Musicbrainz            http://dbtune.org/musicbrainz/sparql
     U.S. Census            http://www.rdfabout.com/sparql
     Semantic Crunchbase    http://cb.semsol.org/sparql

     More complete list:
      http://esw.w3.org/topic/SparqlEndpoints
Accessing a SPARQL Endpoint
     ●   SPARQL endpoints: RESTful Web services
     ●   Issuing a SPARQL query to a remote SPARQL
         endpoint is basically an HTTP GET request to
         the endpoint, with the query passed in the parameter query

    GET /sparql?query=PREFIX+rd... HTTP/1.1
    Host: dbpedia.org
    User-agent: my-sparql-client/0.1

    (query parameter: URL-encoded string with the SPARQL query)
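
The request line above can be assembled with the Java standard library alone. This is a minimal sketch, assuming the endpoint accepts the query in a GET parameter named query as described on this slide; the class name SparqlGetUrl and the tiny query in main are invented for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch: build the GET URL for a SPARQL endpoint by URL-encoding the
// query text into the "query" parameter.
class SparqlGetUrl {

    static String buildUrl(String endpoint, String sparql) {
        try {
            return endpoint + "?query=" + URLEncoder.encode(sparql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("http://dbpedia.org/sparql",
                                    "SELECT ?v WHERE { ?v a ?t }"));
    }
}
```

Spaces become "+" and characters such as "?", "{" and "}" are percent-encoded, which matches the "PREFIX+rd..." form shown in the request line.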
Query Results Formats
     ●   SPARQL endpoints usually support different
         result formats:
          ●   XML, JSON, plain text
              (for ASK and SELECT queries)
          ●   RDF/XML, NTriples, Turtle, N3
              (for DESCRIBE and CONSTRUCT queries)




Query Results Formats
    PREFIX dbp: <http://dbpedia.org/ontology/>
    PREFIX dbpprop: <http://dbpedia.org/property/>

    SELECT ?name ?bday WHERE {
      ?p dbp:birthplace <http://dbpedia.org/resource/Berlin> ;
         dbpprop:dateOfBirth ?bday ;
         dbpprop:name ?name .
    }

      name                   | bday
      -----------------------+------------
      Alexander von Humboldt | 1769-09-14
      Ernst Lubitsch         | 1892-01-28
      ...
XML result format (http://www.w3.org/TR/rdf-sparql-XMLres/):
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="name"/>
    <variable name="bday"/>
  </head>
  <results distinct="false" ordered="true">
   <result>
      <binding name="name">
         <literal xml:lang="en">Alexander von Humboldt</literal>
      </binding>
      <binding name="bday">
         <literal datatype="http://www.w3.org/2001/XMLSchema#date">1769-09-14</literal>
      </binding>
   </result>
   <result>
      <binding name="name">
         <literal xml:lang="en">Ernst Lubitsch</literal>
      </binding>
      <binding name="bday">
         <literal datatype="http://www.w3.org/2001/XMLSchema#date">1892-01-28</literal>
      </binding>
   </result>
   <!-- … -->
  </results>
</sparql>
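
A result document in this XML format can be read with the DOM parser that ships with Java. The sketch below is one possible way to do it, not the way any particular SPARQL library does it; the class name XmlResultsReader and the choice to return each result as a map from variable name to value are assumptions made for the example. Language tags and datatypes are ignored.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Sketch: turn a SPARQL XML result document into a list of rows, where
// each row maps a variable name to the text of its bound value.
class XmlResultsReader {

    static final String NS = "http://www.w3.org/2005/sparql-results#";

    static List<Map<String, String>> read(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // elements live in the results namespace
            Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            List<Map<String, String>> rows = new ArrayList<>();
            NodeList results = doc.getElementsByTagNameNS(NS, "result");
            for (int i = 0; i < results.getLength(); i++) {
                Element result = (Element) results.item(i);
                Map<String, String> row = new LinkedHashMap<>();
                NodeList bindings = result.getElementsByTagNameNS(NS, "binding");
                for (int j = 0; j < bindings.getLength(); j++) {
                    Element binding = (Element) bindings.item(j);
                    // The bound value is the text inside <literal>, <uri> or <bnode>.
                    row.put(binding.getAttribute("name"),
                            binding.getTextContent().trim());
                }
                rows.add(row);
            }
            return rows;
        } catch (Exception e) {
            throw new RuntimeException("Could not parse SPARQL XML results", e);
        }
    }
}
```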
JSON result format (http://www.w3.org/TR/rdf-sparql-json-res/):
{
 "head": { "link": [], "vars": ["name", "bday"] },
 "results": { "distinct": false, "ordered": true, "bindings": [
  { "name": { "type": "literal",
              "xml:lang": "en",
              "value": "Alexander von Humboldt" } ,
    "bday": { "type": "typed-literal",
              "datatype": "http://www.w3.org/2001/XMLSchema#date",
              "value": "1769-09-14" }
  },
  { "name": { "type": "literal",
              "xml:lang": "en",
              "value": "Ernst Lubitsch" } ,
    "bday": { "type": "typed-literal",
              "datatype": "http://www.w3.org/2001/XMLSchema#date",
              "value": "1892-01-28" }
  },
  // ...
 ] }
}
Query Result Formats
     ●   Use the Accept header to request the
         preferred result format:
   GET /sparql?query=PREFIX+rd... HTTP/1.1
   Host: dbpedia.org
   User-agent: my-sparql-client/0.1
   Accept: application/sparql-results+json
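
The same request can be described with the java.net.http API (Java 11 or later). This sketch only builds the request object and does not send it; the class name AcceptHeaderDemo is invented for the example, and the header values are the ones shown on this slide.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch: a GET request that asks the endpoint for JSON results via the
// Accept header. Sending it would additionally need an HttpClient.
class AcceptHeaderDemo {

    static HttpRequest jsonRequest(String url) {
        return HttpRequest.newBuilder()
            .uri(URI.create(url)) // the URL must already contain the encoded query
            .header("Accept", "application/sparql-results+json")
            .header("User-Agent", "my-sparql-client/0.1")
            .GET()
            .build();
    }
}
```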




Query Result Formats
     ●   As an alternative, some SPARQL endpoint
         implementations (e.g., Joseki) provide an
         additional parameter out:

 GET /sparql?out=json&query=... HTTP/1.1
 Host: dbpedia.org
 User-agent: my-sparql-client/0.1




Accessing a SPARQL Endpoint
     ●   More convenient: use a library
     ●   Libraries:
          ●   SPARQL JavaScript Library
              http://www.thefigtrees.net/lee/blog/2006/04/sparql_calendar_demo_a_sparql.html
          ●   ARC for PHP
              http://arc.semsol.org/
          ●   RAP – RDF API for PHP
              http://www4.wiwiss.fu-berlin.de/bizer/rdfapi/index.html




Accessing a SPARQL Endpoint
     ●   Libraries (cont.):
          ●   Jena / ARQ (Java) http://jena.sourceforge.net/
          ●   Sesame (Java) http://www.openrdf.org/
          ●   SPARQL Wrapper (Python)
              http://sparql-wrapper.sourceforge.net/
          ●   PySPARQL (Python)
              http://code.google.com/p/pysparql/




Accessing a SPARQL Endpoint
     ●   Example with Jena / ARQ:
   import com.hp.hpl.jena.query.*;

   String service = "..."; // address of the SPARQL endpoint
   String query = "SELECT ..."; // your SPARQL query
   QueryExecution e = QueryExecutionFactory.sparqlService( service, query );
   try {
       ResultSet results = e.execSelect(); // sends the query to the endpoint
       while ( results.hasNext() ) {
           QuerySolution s = results.nextSolution(); // one row of variable bindings
           // …
       }
   } finally {
       e.close(); // release resources even if the query fails
   }
●   Querying a single dataset is quite boring compared to
    issuing SPARQL queries over multiple datasets

     ●   How can you do this?
          1. Issue follow-up queries to different endpoints
          2. Query a central collection of datasets
          3. Build a store with copies of relevant datasets
          4. Use a query federation system



Follow-up Queries
     ●   Idea: issue follow-up queries over other
         datasets based on results from previous
         queries
     ●   Substituting placeholders in query templates




   String s1 = "http://cb.semsol.org/sparql";
   String s2 = "http://dbpedia.org/sparql";

   // Template for the follow-up query; %s is replaced by a URI from the first result.
   String qTmpl = "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> "
                + "SELECT ?c WHERE { <%s> rdfs:comment ?c }";

   // First query: find a list of companies filtered by some criteria
   // and return their DBpedia URIs.
   String q1 = "SELECT ?s WHERE { ...";
   QueryExecution e1 = QueryExecutionFactory.sparqlService(s1, q1);
   ResultSet results1 = e1.execSelect();
   while ( results1.hasNext() ) {
     QuerySolution sol = results1.nextSolution();
     // Follow-up query against the second endpoint for each URI found.
     String q2 = String.format( qTmpl, sol.getResource("s").getURI() );
     QueryExecution e2 = QueryExecutionFactory.sparqlService(s2, q2);
     ResultSet results2 = e2.execSelect();
     while ( results2.hasNext() ) {
       // ...
     }
     e2.close();
   }
   e1.close();
Follow-up Queries
     ●   Advantage:
          ●   Queried data is up-to-date
     ●   Drawbacks:
          ●   Requires the existence of a SPARQL endpoint for
              each dataset
          ●   Requires program logic
          ●   Very inefficient (one follow-up query per intermediate result)




Querying a Collection of Datasets
     ●   Idea: Use an existing SPARQL endpoint that
         provides access to a set of copies of relevant
         datasets
     ●   Example:
          ●   SPARQL endpoint by OpenLink Software over a majority
              of datasets from the LOD cloud at:
              http://lod.openlinksw.com/sparql




Querying a Collection of Datasets
     ●   Advantage:
          ●   No need for specific program logic
     ●   Drawbacks:
          ●   Queried data might be out of date
          ●   Not all relevant datasets may be in the collection




Own Store of Dataset Copies
     ●   Idea: Build your own store with copies of
         relevant datasets and query it
     ●   Possible stores:
          ●   Jena TDB http://jena.hpl.hp.com/wiki/TDB
          ●   Sesame http://www.openrdf.org/
          ●   OpenLink Virtuoso http://virtuoso.openlinksw.com/
          ●   4store http://4store.org/
          ●   AllegroGraph http://www.franz.com/agraph/
          ●   etc.

Own Store of Dataset Copies
     ●   Advantages:
          ●   No need for specific program logic
          ●   Can include all datasets
          ●   Independent of the existence, availability, and
              efficiency of SPARQL endpoints
     ●   Drawbacks:
          ●   Requires effort to set up and to operate the store
          ●   Ideally, data sources provide RDF dumps; if not?
          ●   How to keep the copies in sync with the originals?
          ●   Queried data might be out of date
Federated Query Processing
     ●   Idea: Querying a mediator which
         distributes subqueries to
         relevant sources and
         integrates the results



Federated Query Processing
     ●   Instance-based federation
          ●   Each thing described by only one data source
          ●   Atypical for the Web of Data
     ●   Triple-based federation
          ●   No restrictions
          ●   Requires more distributed joins


     ●   Statistics about datasets required (both cases)

Federated Query Processing
     ●   DARQ (Distributed ARQ) http://darq.sourceforge.net/
          ●   Query engine for federated SPARQL queries
          ●   Extension of ARQ (query engine for Jena)
          ●   Last update: June 28, 2006




Federated Query Processing
     ●   Semantic Web Integrator and Query Engine
         (SemWIQ) http://semwiq.sourceforge.net/
          ●   Actively maintained by Andreas Langegger




Federated Query Processing
     ●   Advantages:
          ●   No need for specific program logic
          ●   Queried data is up to date
     ●   Drawbacks:
          ●   Requires the existence of a SPARQL endpoint for
              each dataset
          ●   Requires effort to set up and configure the mediator




In any case:
     ●   You have to know the relevant data sources
          ●   When developing the app using follow-up queries
          ●   When selecting an existing SPARQL endpoint over
              a collection of dataset copies
          ●   When setting up your own store with a collection of
              dataset copies
          ●   When configuring your query federation system
     ●   You restrict yourself to the selected sources



     ●   There is an alternative:
         Remember, URIs link to data
Automated Link Traversal
     ●   Idea: Discover further data by looking up
         relevant URIs in your application
     ●   Can be combined with the previous approaches




Link Traversal Based Query Execution
     ●   Applies the idea of automated link traversal to the
         execution of SPARQL queries
     ●   Idea:
          ●   Intertwine query evaluation with traversal of RDF links
          ●   Discover data that might contribute to query results
              during query execution
     ●   Alternate between:
          ●   Evaluating parts of the query over the queried data
          ●   Looking up URIs in intermediate solutions
Link Traversal Based Query Execution
     ●   Example:
         Return the unemployment rate of the countries in
         which the movie http://mymovie.db/movie2449
         was filmed.

 SELECT ?c ?u WHERE {
   <http://mymovie.db/movie2449> mov:filming_location ?c .
   ?c geo:statistics ?cStats .
   ?cStats stat:unempRate ?u . }

     ●   Step 1: Look up the URI http://mymovie.db/movie2449 from the
         query and add the retrieved data to the queried data:
                ...
   <http://mymovie.db/movie2449>
          mov:filming_location <http://geo.../Italy> .
                ...
     ●   Step 2: Evaluate the first triple pattern over the queried
         data; this yields an intermediate solution:
                ?c
                http://geo.../Italy
     ●   Step 3: Look up the URI http://geo.../Italy from the
         intermediate solution and add the retrieved data to the
         queried data:
                ...
   <http://geo.../Italy>
           geo:statistics <http://example.db/stat/IT> .
                ...
     ●   Step 4: Evaluate the second triple pattern; the intermediate
         solution is extended:
                ?c                     ?cStats
                http://geo.../Italy    http://example.db/stat/IT
     ●   Proceed with this strategy
         (traverse RDF links during query execution)
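
The alternation between URI lookups and pattern evaluation can be sketched in a few lines of plain Java. This is a deliberately simplified illustration, not how SWClLib or SQUIN are implemented: it only handles chain-shaped queries like the one above (each pattern continues from the object of the previous one), it returns only the bindings of the last variable, and the WEB map stands in for HTTP lookups. The class name LinkTraversalDemo and the value RATE_IT are placeholders invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of link traversal based execution for a chain-shaped query.
class LinkTraversalDemo {

    // URI -> triples returned when that URI is looked up (stand-in for HTTP).
    static final Map<String, List<String[]>> WEB = new HashMap<>();
    static {
        WEB.put("http://mymovie.db/movie2449", Arrays.asList(new String[][] {
            {"http://mymovie.db/movie2449", "mov:filming_location", "http://geo.../Italy"}}));
        WEB.put("http://geo.../Italy", Arrays.asList(new String[][] {
            {"http://geo.../Italy", "geo:statistics", "http://example.db/stat/IT"}}));
        WEB.put("http://example.db/stat/IT", Arrays.asList(new String[][] {
            {"http://example.db/stat/IT", "stat:unempRate", "RATE_IT"}})); // placeholder value
    }

    // Follows the predicates one after another, starting at startUri.
    // Before each pattern is evaluated, every URI bound so far is looked up
    // (once) and the retrieved triples are added to the queried data.
    static List<String> execute(String startUri, List<String> predicates) {
        List<String[]> queriedData = new ArrayList<>();
        Set<String> lookedUp = new HashSet<>();
        List<String> current = new ArrayList<>();
        current.add(startUri);

        for (String predicate : predicates) {
            for (String uri : current) {
                if (lookedUp.add(uri)) {
                    // Unknown URIs simply contribute no data.
                    queriedData.addAll(
                        WEB.getOrDefault(uri, Collections.<String[]>emptyList()));
                }
            }
            List<String> next = new ArrayList<>();
            for (String uri : current) {
                for (String[] t : queriedData) {
                    if (t[0].equals(uri) && t[1].equals(predicate)) {
                        next.add(t[2]); // binding for the next variable
                    }
                }
            }
            current = next;
        }
        return current;
    }
}
```

Calling execute with the movie URI and the predicates mov:filming_location, geo:statistics, stat:unempRate walks through the same lookups as the steps above and ends with the placeholder rate.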
Link Traversal Based Query Execution
     ●   Advantages:
          ●   No need to know all data sources in advance
          ●   No need for specific programming logic
          ●   Queried data is up to date
          ●   Independent of the existence of SPARQL endpoints
              provided by the data sources
     ●   Drawbacks:
          ●   Not as fast as a centralized collection of copies
          ●   Unsuitable for some queries
          ●   Results might be incomplete
Implementations
     ●   Semantic Web Client library (SWClLib) for Java
         http://www4.wiwiss.fu-berlin.de/bizer/ng4j/semwebclient/
     ●   SWIC for Prolog http://moustaki.org/swic/




Implementations
     ●   SQUIN http://squin.org
          ●   Provides SWClLib functionality as a Web service
          ●   Accessible like a SPARQL endpoint
          ●   Public SQUIN service at:
                       http://squin.informatik.hu-berlin.de/SQUIN/
          ●   Install package: unzip and start
          ●   Convenient access with SQUIN PHP tools:

               $s = 'http:// …'; // address of the SQUIN service
               $q = new SparqlQuerySock( $s, '… SELECT ...' );
               $res = $q->getJsonResult(); // or getXmlResult()
Real-World Examples
     ●   Return phone numbers of authors of ontology engineering
         papers at ESWC'09.

 SELECT DISTINCT ?author ?phone WHERE {
     ?pub swc:isPartOf
           <http://data.semanticweb.org/conference/eswc/2009/proceedings> .
     ?pub swc:hasTopic ?topic . ?topic rdfs:label ?topicLabel .
     FILTER regex( str(?topicLabel), "ontology engineering", "i" ) .

     ?pub swrc:author ?author .
     { ?author owl:sameAs ?authorAlt }
     UNION
     { ?authorAlt owl:sameAs ?author }

     ?authorAlt foaf:phone ?phone .
 }

     # of query results           2
     # of retrieved graphs      297
     # of accessed servers       16
     avg. execution time    1min 30sec
ISWC 2009 Tutorial "How to Consume Linked Data on the Web"

More Related Content

What's hot (20)

PPTX
RDF data model
Jose Emilio Labra Gayo
 
PDF
PySpark in practice slides
Dat Tran
 
PPTX
SPARQL introduction and training (130+ slides with exercices)
Thomas Francart
 
PPT
SPARQL Tutorial
Leigh Dodds
 
PPT
PHP POWERPOINT SLIDES
Ismail Mukiibi
 
PPTX
Introduction to SPARQL
Jose Emilio Labra Gayo
 
PPT
SPARQL in a nutshell
Fabien Gandon
 
PPTX
Resource description framework
hozifa1010
 
PPTX
Apresentação rest api
Matheus Petroni
 
PPTX
Building Next-Generation Web APIs with JSON-LD and Hydra
Markus Lanthaler
 
PPTX
Semantic web meetup – sparql tutorial
AdonisDamian
 
PDF
Jena – A Semantic Web Framework for Java
Aleksander Pohl
 
PDF
Spark SQL
Joud Khattab
 
PDF
Introduction to Model-Based Machine Learning
Daniel Emaasit
 
PPTX
Introduction to RDF Data Model
Cesar Augusto Nogueira
 
PDF
High-speed Database Throughput Using Apache Arrow Flight SQL
ScyllaDB
 
PPTX
Learn Apache Spark: A Comprehensive Guide
Whizlabs
 
PDF
Secrets of the DSpace Submission Form
Bram Luyten
 
PPTX
RDF, linked data and semantic web
Jose Emilio Labra Gayo
 
PPTX
Avro introduction
Nanda8904648951
 
RDF data model
Jose Emilio Labra Gayo
 
PySpark in practice slides
Dat Tran
 
SPARQL introduction and training (130+ slides with exercices)
Thomas Francart
 
SPARQL Tutorial
Leigh Dodds
 
PHP POWERPOINT SLIDES
Ismail Mukiibi
 
Introduction to SPARQL
Jose Emilio Labra Gayo
 
SPARQL in a nutshell
Fabien Gandon
 
Resource description framework
hozifa1010
 
Apresentação rest api
Matheus Petroni
 
Building Next-Generation Web APIs with JSON-LD and Hydra
Markus Lanthaler
 
Semantic web meetup – sparql tutorial
AdonisDamian
 
Jena – A Semantic Web Framework for Java
Aleksander Pohl
 
Spark SQL
Joud Khattab
 
Introduction to Model-Based Machine Learning
Daniel Emaasit
 
Introduction to RDF Data Model
Cesar Augusto Nogueira
 
High-speed Database Throughput Using Apache Arrow Flight SQL
ScyllaDB
 
Learn Apache Spark: A Comprehensive Guide
Whizlabs
 
Secrets of the DSpace Submission Form
Bram Luyten
 
RDF, linked data and semantic web
Jose Emilio Labra Gayo
 



Querying Linked Data with SPARQL

  • 1. Querying Linked Data with SPARQL ISWC 2009 Tutorial "How to Consume Linked Data on the Web"
  • 2. Brief Introduction to SPARQL
       ● SPARQL: query language for RDF data
       ● Main idea: pattern matching
            ● Describe subgraphs of the queried RDF graph
            ● Subgraphs that match your description yield a result
            ● Means: graph patterns (i.e., RDF graphs with variables)
       Example graph pattern: ?v rdf:type http://.../Volcano
  • 3. Brief Introduction to SPARQL
       Queried graph:
            http://.../Mount_Baker   rdf:type         http://.../Volcano
            http://.../Mount_Baker   p:lastEruption   "1880"
            http://.../Mount_Etna    rdf:type         http://.../Volcano
       Graph pattern: ?v rdf:type http://.../Volcano
       Results:
            ?v
            http://.../Mount_Baker
            http://.../Mount_Etna
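The matching step on this slide can be sketched in a few lines of plain Java. This is only an illustration of the idea, not how a SPARQL engine is implemented: the class and method names are made up, triples are stored as string arrays, and the elided URIs ("http://.../...") are used exactly as they appear on the slide.

```java
import java.util.ArrayList;
import java.util.List;

class TriplePatternDemo {
    // The queried graph as read from the slide diagram (subject, predicate, object).
    static final List<String[]> GRAPH = List.of(
        new String[] {"http://.../Mount_Baker", "rdf:type", "http://.../Volcano"},
        new String[] {"http://.../Mount_Baker", "p:lastEruption", "\"1880\""},
        new String[] {"http://.../Mount_Etna", "rdf:type", "http://.../Volcano"});

    // Evaluates the pattern "?v <predicate> <object>": returns every subject
    // whose triple has the given predicate and object.
    static List<String> matchSubjects(List<String[]> graph, String predicate, String object) {
        List<String> bindings = new ArrayList<>();
        for (String[] triple : graph) {
            if (triple[1].equals(predicate) && triple[2].equals(object)) {
                bindings.add(triple[0]);
            }
        }
        return bindings;
    }

    public static void main(String[] args) {
        // Pattern from the slide: ?v rdf:type http://.../Volcano
        System.out.println(matchSubjects(GRAPH, "rdf:type", "http://.../Volcano"));
        // prints [http://.../Mount_Baker, http://.../Mount_Etna]
    }
}
```

The p:lastEruption triple does not match the pattern, so it contributes no binding for ?v.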
  • 4. SPARQL Endpoints
       ● Linked data sources usually provide a SPARQL endpoint for their dataset(s)
       ● SPARQL endpoint: SPARQL query processing service that supports the SPARQL protocol*
       ● Send your SPARQL query, receive the result
       * http://www.w3.org/TR/rdf-sparql-protocol/
  • 5. SPARQL Endpoints
       Data Source           Endpoint Address
       DBpedia               http://dbpedia.org/sparql
       Musicbrainz           http://dbtune.org/musicbrainz/sparql
       U.S. Census           http://www.rdfabout.com/sparql
       Semantic Crunchbase   http://cb.semsol.org/sparql
       More complete list: http://esw.w3.org/topic/SparqlEndpoints
  • 6. Accessing a SPARQL Endpoint
       ● SPARQL endpoints: RESTful Web services
       ● Issuing a SPARQL query to a remote SPARQL endpoint is basically an HTTP GET request to the endpoint with the parameter query
            GET /sparql?query=PREFIX+rd... HTTP/1.1
            Host: dbpedia.org
            User-agent: my-sparql-client/0.1
       The value of query is a URL-encoded string with the SPARQL query
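As a rough sketch of what "URL-encoded string with the SPARQL query" means in practice, the request URL can be assembled with the JDK alone. The class name and the sample query are assumptions made for this illustration; only the endpoint address and the parameter name query come from the slides.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class SparqlGetUrl {
    // Appends the URL-encoded query as the "query" parameter of the endpoint address.
    static String build(String endpoint, String sparql) {
        return endpoint + "?query=" + URLEncoder.encode(sparql, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(build("http://dbpedia.org/sparql", "SELECT ?s WHERE { ?s ?p ?o }"));
        // prints http://dbpedia.org/sparql?query=SELECT+%3Fs+WHERE+%7B+%3Fs+%3Fp+%3Fo+%7D
    }
}
```

URLEncoder turns spaces into "+", which matches the "PREFIX+rd..." fragment shown in the request line above.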
  • 7. Query Results Formats
       ● SPARQL endpoints usually support different result formats:
            ● XML, JSON, plain text (for ASK and SELECT queries)
            ● RDF/XML, NTriples, Turtle, N3 (for DESCRIBE and CONSTRUCT queries)
  • 8. Query Results Formats
       PREFIX dbp: <http://dbpedia.org/ontology/>
       PREFIX dbpprop: <http://dbpedia.org/property/>
       SELECT ?name ?bday WHERE {
           ?p dbp:birthplace <http://dbpedia.org/resource/Berlin> ;
              dbpprop:dateOfBirth ?bday ;
              dbpprop:name ?name .
       }

       name                    | bday
       ------------------------+------------
       Alexander von Humboldt  | 1769-09-14
       Ernst Lubitsch          | 1892-01-28
       ...
  • 9. Query Results Formats: XML (http://www.w3.org/TR/rdf-sparql-XMLres/)
       <?xml version="1.0"?>
       <sparql xmlns="http://www.w3.org/2005/sparql-results#">
         <head>
           <variable name="name"/>
           <variable name="bday"/>
         </head>
         <results distinct="false" ordered="true">
           <result>
             <binding name="name">
               <literal xml:lang="en">Alexander von Humboldt</literal>
             </binding>
             <binding name="bday">
               <literal datatype="http://www.w3.org/2001/XMLSchema#date">1769-09-14</literal>
             </binding>
           </result>
           <result>
             <binding name="name">
               <literal xml:lang="en">Ernst Lubitsch</literal>
             </binding>
             <binding name="bday">
               <literal datatype="http://www.w3.org/2001/XMLSchema#date">1892-01-28</literal>
             </binding>
           </result>
           <!-- … -->
         </results>
       </sparql>
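A result document in this XML format can be read with the DOM parser that ships with the JDK. The following is a minimal sketch, not a complete reader for the format: the class name is invented, language tags and datatypes are ignored, and the embedded sample is a shortened copy of the document above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class SparqlXmlResults {
    // Shortened copy of the result document from the slide.
    static final String SAMPLE = "<?xml version=\"1.0\"?>"
        + "<sparql xmlns=\"http://www.w3.org/2005/sparql-results#\">"
        + "<head><variable name=\"name\"/><variable name=\"bday\"/></head>"
        + "<results>"
        + "<result>"
        + "<binding name=\"name\"><literal xml:lang=\"en\">Alexander von Humboldt</literal></binding>"
        + "<binding name=\"bday\"><literal datatype=\"http://www.w3.org/2001/XMLSchema#date\">1769-09-14</literal></binding>"
        + "</result>"
        + "<result>"
        + "<binding name=\"name\"><literal xml:lang=\"en\">Ernst Lubitsch</literal></binding>"
        + "<binding name=\"bday\"><literal datatype=\"http://www.w3.org/2001/XMLSchema#date\">1892-01-28</literal></binding>"
        + "</result>"
        + "</results></sparql>";

    // Returns one map per <result>, mapping variable name to the text of its binding.
    static List<Map<String, String>> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<Map<String, String>> rows = new ArrayList<>();
            NodeList results = doc.getElementsByTagName("result");
            for (int i = 0; i < results.getLength(); i++) {
                Element result = (Element) results.item(i);
                Map<String, String> row = new LinkedHashMap<>();
                NodeList bindings = result.getElementsByTagName("binding");
                for (int j = 0; j < bindings.getLength(); j++) {
                    Element binding = (Element) bindings.item(j);
                    row.put(binding.getAttribute("name"), binding.getTextContent().trim());
                }
                rows.add(row);
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable SPARQL XML result", e);
        }
    }

    public static void main(String[] args) {
        for (Map<String, String> row : parse(SAMPLE)) {
            System.out.println(row.get("name") + " | " + row.get("bday"));
        }
    }
}
```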
  • 10. Query Results Formats: JSON (http://www.w3.org/TR/rdf-sparql-json-res/)
       { "head": { "link": [], "vars": ["name", "bday"] },
         "results": { "distinct": false, "ordered": true, "bindings": [
           { "name": { "type": "literal", "xml:lang": "en", "value": "Alexander von Humboldt" },
             "bday": { "type": "typed-literal", "datatype": "http://www.w3.org/2001/XMLSchema#date", "value": "1769-09-14" } },
           { "name": { "type": "literal", "xml:lang": "en", "value": "Ernst Lubitsch" },
             "bday": { "type": "typed-literal", "datatype": "http://www.w3.org/2001/XMLSchema#date", "value": "1892-01-28" } },
           // ...
         ] }
       }
  • 11. Query Result Formats
       ● Use the Accept header to request the preferred result format:
            GET /sparql?query=PREFIX+rd... HTTP/1.1
            Host: dbpedia.org
            User-agent: my-sparql-client/0.1
            Accept: application/sparql-results+json
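With the HTTP client that is part of the JDK since Java 11, a request of this shape might be prepared as follows. This is a sketch under assumptions: the class name and the ASK query are illustrative, and the code only builds the request; actually sending it would additionally need an HttpClient and a reachable endpoint.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

class AcceptHeaderDemo {
    // Builds (but does not send) a GET request that asks for JSON results.
    static HttpRequest jsonRequest(String endpoint, String sparql) {
        String url = endpoint + "?query=" + URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(url))
                .header("Accept", "application/sparql-results+json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = jsonRequest("http://dbpedia.org/sparql", "ASK { ?s ?p ?o }");
        System.out.println(request.method() + " " + request.uri());
        System.out.println("Accept: " + request.headers().firstValue("Accept").orElse(""));
    }
}
```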
  • 12. Query Result Formats
       ● As an alternative, some SPARQL endpoint implementations (e.g. Joseki) provide an additional parameter out:
            GET /sparql?out=json&query=... HTTP/1.1
            Host: dbpedia.org
            User-agent: my-sparql-client/0.1
  • 13. Accessing a SPARQL Endpoint
       ● More convenient: use a library
       ● Libraries:
            ● SPARQL JavaScript Library
              http://www.thefigtrees.net/lee/blog/2006/04/sparql_calendar_demo_a_sparql.html
            ● ARC for PHP
              http://arc.semsol.org/
            ● RAP – RDF API for PHP
              http://www4.wiwiss.fu-berlin.de/bizer/rdfapi/index.html
  • 14. Accessing a SPARQL Endpoint
       ● Libraries (cont.):
            ● Jena / ARQ (Java)
              http://jena.sourceforge.net/
            ● Sesame (Java)
              http://www.openrdf.org/
            ● SPARQL Wrapper (Python)
              http://sparql-wrapper.sourceforge.net/
            ● PySPARQL (Python)
              http://code.google.com/p/pysparql/
  • 15. Accessing a SPARQL Endpoint
       ● Example with Jena / ARQ:
            import com.hp.hpl.jena.query.*;

            String service = "...";       // address of the SPARQL endpoint
            String query = "SELECT ...";  // your SPARQL query
            QueryExecution e = QueryExecutionFactory.sparqlService( service, query );
            ResultSet results = e.execSelect();
            while ( results.hasNext() ) {
                QuerySolution s = results.nextSolution();
                // …
            }
            e.close();
  • 16. Querying a single dataset is quite boring compared to:
       ● Issuing SPARQL queries over multiple datasets
       ● How can you do this?
            1. Issue follow-up queries to different endpoints
            2. Query a central collection of datasets
            3. Build a store with copies of relevant datasets
            4. Use a query federation system
  • 17. Follow-up Queries
       ● Idea: issue follow-up queries over other datasets based on results from previous queries
       ● Substituting placeholders in query templates
  • 18. Follow-up Queries (example with Jena / ARQ)
       String s1 = "http://cb.semsol.org/sparql";
       String s2 = "http://dbpedia.org/sparql";
       String qTmpl = "SELECT ?c WHERE{ <%s> rdfs:comment ?c }";
       // q1: find a list of companies filtered by some criteria
       // and return the DBpedia URIs of them
       String q1 = "SELECT ?s WHERE { ...";
       QueryExecution e1 = QueryExecutionFactory.sparqlService(s1,q1);
       ResultSet results1 = e1.execSelect();
       while ( results1.hasNext() ) {
           // named sol because s1 already holds the first endpoint address
           QuerySolution sol = results1.nextSolution();
           String q2 = String.format( qTmpl, sol.getResource("s").getURI() );
           QueryExecution e2 = QueryExecutionFactory.sparqlService(s2,q2);
           ResultSet results2 = e2.execSelect();
           while ( results2.hasNext() ) {
               // ...
           }
           e2.close();
       }
       e1.close();
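The template step can also be looked at in isolation. The sketch below fills the same template but first checks that the URI contains none of the characters that the SPARQL grammar excludes from an IRI written in angle brackets, since a URI taken from a remote result is untrusted input. The class and method names are hypothetical, and rejecting the empty string is a choice made for this example rather than a rule of the grammar.

```java
class QueryTemplate {
    // Template from the slide; %s is the placeholder for a URI from the first query.
    static final String TEMPLATE = "SELECT ?c WHERE{ <%s> rdfs:comment ?c }";

    // True if the string can be placed between < and > in a SPARQL query:
    // no control characters or spaces, and none of < > " { } | ^ ` \
    static boolean isSafeIri(String uri) {
        for (int i = 0; i < uri.length(); i++) {
            char ch = uri.charAt(i);
            if (ch <= 0x20 || "<>\"{}|^`\\".indexOf(ch) >= 0) {
                return false;
            }
        }
        return !uri.isEmpty();
    }

    // Substitutes the URI into the template, refusing values that would break the query.
    static String fill(String template, String uri) {
        if (!isSafeIri(uri)) {
            throw new IllegalArgumentException("not usable as an IRI: " + uri);
        }
        return String.format(template, uri);
    }

    public static void main(String[] args) {
        System.out.println(fill(TEMPLATE, "http://dbpedia.org/resource/Berlin"));
        // prints SELECT ?c WHERE{ <http://dbpedia.org/resource/Berlin> rdfs:comment ?c }
    }
}
```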
  • 19. Follow-up Queries
       ● Advantage:
            ● Queried data is up-to-date
       ● Drawbacks:
            ● Requires the existence of a SPARQL endpoint for each dataset
            ● Requires program logic
            ● Very inefficient
  • 20. Querying a Collection of Datasets
       ● Idea: Use an existing SPARQL endpoint that provides access to a set of copies of relevant datasets
       ● Example:
            ● SPARQL endpoint by OpenLink SW over a majority of datasets from the LOD cloud at:
              http://lod.openlinksw.com/sparql
  • 21. Querying a Collection of Datasets
       ● Advantage:
            ● No need for specific program logic
       ● Drawbacks:
            ● Queried data might be out of date
            ● Not all relevant datasets may be in the collection
  • 22. Own Store of Dataset Copies
       ● Idea: Build your own store with copies of relevant datasets and query it
       ● Possible stores:
            ● Jena TDB            http://jena.hpl.hp.com/wiki/TDB
            ● Sesame              http://www.openrdf.org/
            ● OpenLink Virtuoso   http://virtuoso.openlinksw.com/
            ● 4store              http://4store.org/
            ● AllegroGraph        http://www.franz.com/agraph/
            ● etc.
  • 23. Own Store of Dataset Copies
       ● Advantages:
            ● No need for specific program logic
            ● Can include all datasets
            ● Independent of the existence, availability, and efficiency of SPARQL endpoints
       ● Drawbacks:
            ● Requires effort to set up and to operate the store
            ● Ideally, data sources provide RDF dumps; what if they do not?
            ● How to keep the copies in sync with the originals?
            ● Queried data might be out of date
  • 24. Federated Query Processing
       ● Idea: Query a mediator which distributes subqueries to relevant sources and integrates the results
       [Diagram: a mediator forwarding subqueries ("?") to several sources]
  • 25. Federated Query Processing
       ● Instance-based federation
            ● Each thing described by only one data source
            ● Atypical for the Web of Data
       ● Triple-based federation
            ● No restrictions
            ● Requires more distributed joins
       ● Statistics about datasets required (in both cases)
  • 26. Federated Query Processing
       ● DARQ (Distributed ARQ)
         http://darq.sourceforge.net/
            ● Query engine for federated SPARQL queries
            ● Extension of ARQ (query engine for Jena)
            ● Last update: June 28, 2006
  • 27. Federated Query Processing
       ● Semantic Web Integrator and Query Engine (SemWIQ)
         http://semwiq.sourceforge.net/
            ● Actively maintained by Andreas Langegger
  • 28. Federated Query Processing
       ● Advantages:
            ● No need for specific program logic
            ● Queried data is up to date
       ● Drawbacks:
            ● Requires the existence of a SPARQL endpoint for each dataset
            ● Requires effort to set up and configure the mediator
  • 29. In any case:
       ● You have to know the relevant data sources
            ● When developing the app using follow-up queries
            ● When selecting an existing SPARQL endpoint over a collection of dataset copies
            ● When setting up your own store with a collection of dataset copies
            ● When configuring your query federation system
       ● You restrict yourself to the selected sources
  • 30. In any case:
       ● You have to know the relevant data sources
            ● When developing the app using follow-up queries
            ● When selecting an existing SPARQL endpoint over a collection of dataset copies
            ● When setting up your own store with a collection of dataset copies
            ● When configuring your query federation system
       ● You restrict yourself to the selected sources
       There is an alternative: Remember, URIs link to data
  • 31. Automated Link Traversal
  • 32. Automated Link Traversal
       ● Idea: Discover further data by looking up relevant URIs in your application
       ● Can be combined with the previous approaches
  • 33. Link Traversal Based Query Execution
       ● Applies the idea of automated link traversal to the execution of SPARQL queries
       ● Idea:
            ● Intertwine query evaluation with traversal of RDF links
            ● Discover data that might contribute to query results during query execution
       ● Alternately:
            ● Evaluate parts of the query
            ● Look up URIs in intermediate solutions
       [Diagram: queried data]
  • 34. Link Traversal Based Query Execution
       SELECT ?c ?u WHERE {
           <http://mymovie.db/movie2449> mov:filming_location ?c .
           ?c geo:statistics ?cStats .
           ?cStats stat:unempRate ?u .
       }
       ● Example: Return the unemployment rate of the countries in which the movie http://mymovie.db/movie2449 was filmed.
       [Diagram: queried data]
  • 35. Link Traversal Based Query Execution
       (query as on slide 34)
       [Diagram: look-up of http://mymovie.db/movie2449; queried data]
  • 36. Link Traversal Based Query Execution
       (query as on slide 34)
       [Diagram: queried data]
  • 37. Link Traversal Based Query Execution
       (query as on slide 34)
       Queried data:
            ...
            <http://mymovie.db/movie2449> mov:filming_location <http://geo.../Italy> .
            ...
  • 38. Link Traversal Based Query Execution
       (query as on slide 34)
       Intermediate solutions:
            ?loc
            http://geo.../Italy
       Queried data:
            ...
            <http://mymovie.db/movie2449> mov:filming_location <http://geo.../Italy> .
            ...
  • 39. Link Traversal Based Query Execution
       (query as on slide 34)
       Intermediate solutions:
            ?loc
            http://geo.../Italy
       [Diagram: look-up of http://geo.../Italy; queried data]
  • 40. Link Traversal Based Query Execution
       (query as on slide 34)
       Intermediate solutions:
            ?loc
            http://geo.../Italy
       [Diagram: look-up of http://geo.../Italy, continued; queried data]
  • 41. Link Traversal Based Query Execution
       (query as on slide 34)
       Intermediate solutions:
            ?loc
            http://geo.../Italy
       [Diagram: queried data]
  • 42. Link Traversal Based Query Execution
       (query as on slide 34)
       Intermediate solutions:
            ?loc
            http://geo.../Italy
       Queried data:
            ...
            <http://geo.../Italy> geo:statistics <http://example.db/stat/IT> .
            ...
  • 43. Link Traversal Based Query Execution
       (query as on slide 34)
       Intermediate solutions:
            ?loc                  ?stat
            http://geo.../Italy   http://stats.db/../it
       Queried data:
            ...
            <http://geo.../Italy> geo:statistics <http://example.db/stat/IT> .
            ...
  • 44. Link Traversal Based Query Execution
       (query as on slide 34)
       Intermediate solutions:
            ?loc                  ?stat
            http://geo.../Italy   http://stats.db/../it
       ● Proceed with this strategy (traverse RDF links during query execution)
       [Diagram: queried data]
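The walk-through above can be imitated with a small in-memory simulation. This is a simplified sketch of the alternation between evaluating triple patterns and looking up URIs, not the algorithm of any of the implementations named in this deck. The WEB map stands in for HTTP look-ups, all class and method names are hypothetical, the elided URIs are used as written on the slides, and the unemployment value "7.8" is invented purely for the demonstration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class LinkTraversalSketch {
    // Simulated Web: URI -> triples obtained when that URI is looked up.
    static final Map<String, List<String[]>> WEB = new HashMap<>();

    static void publish(String uri, String s, String p, String o) {
        WEB.computeIfAbsent(uri, k -> new ArrayList<>()).add(new String[] {s, p, o});
    }

    static {
        publish("http://mymovie.db/movie2449",
                "http://mymovie.db/movie2449", "mov:filming_location", "http://geo.../Italy");
        publish("http://geo.../Italy",
                "http://geo.../Italy", "geo:statistics", "http://example.db/stat/IT");
        // Hypothetical value, not taken from the slides.
        publish("http://example.db/stat/IT",
                "http://example.db/stat/IT", "stat:unempRate", "7.8");
    }

    // The three triple patterns of the query on slide 34.
    static final String[][] QUERY = {
        {"http://mymovie.db/movie2449", "mov:filming_location", "?c"},
        {"?c", "geo:statistics", "?cStats"},
        {"?cStats", "stat:unempRate", "?u"}
    };

    // Looks a URI up at most once and adds the retrieved triples to the queried data.
    static void lookUp(String term, Set<String> seen, List<String[]> data) {
        if (term.startsWith("http://") && seen.add(term)) {
            data.addAll(WEB.getOrDefault(term, List.of()));
        }
    }

    // Extends a solution so that the pattern matches the triple; null if impossible.
    static Map<String, String> bind(String[] pattern, String[] triple, Map<String, String> solution) {
        Map<String, String> extended = new HashMap<>(solution);
        for (int i = 0; i < 3; i++) {
            if (pattern[i].startsWith("?")) {
                String bound = extended.putIfAbsent(pattern[i], triple[i]);
                if (bound != null && !bound.equals(triple[i])) return null;
            } else if (!pattern[i].equals(triple[i])) {
                return null;
            }
        }
        return extended;
    }

    static List<Map<String, String>> execute(String[][] patterns) {
        List<String[]> data = new ArrayList<>();  // queried data, grows during execution
        Set<String> seen = new HashSet<>();
        for (String[] p : patterns) {             // seed: URIs mentioned in the query
            for (String term : p) lookUp(term, seen, data);
        }
        List<Map<String, String>> solutions = new ArrayList<>();
        solutions.add(new HashMap<>());
        for (String[] p : patterns) {
            // Traverse links: look up URIs bound in the intermediate solutions ...
            for (Map<String, String> s : solutions) {
                for (String value : s.values()) lookUp(value, seen, data);
            }
            // ... then evaluate the next pattern over the data discovered so far.
            List<Map<String, String>> next = new ArrayList<>();
            for (Map<String, String> s : solutions) {
                for (String[] t : data) {
                    Map<String, String> extended = bind(p, t, s);
                    if (extended != null) next.add(extended);
                }
            }
            solutions = next;
        }
        return solutions;
    }

    public static void main(String[] args) {
        for (Map<String, String> s : execute(QUERY)) {
            System.out.println(s.get("?c") + " " + s.get("?u"));
        }
    }
}
```

The sketch evaluates the patterns strictly in the order given, so a query whose first pattern contains no URI would find nothing to look up; that is one way to see why the approach is unsuitable for some queries.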
  • 45. Link Traversal Based Query Execution
       ● Advantages:
            ● No need to know all data sources in advance
            ● No need for specific programming logic
            ● Queried data is up to date
            ● Independent of the existence of SPARQL endpoints provided by the data sources
       ● Drawbacks:
            ● Not as fast as a centralized collection of copies
            ● Unsuitable for some queries
            ● Results might be incomplete
  • 46. Implementations
       ● Semantic Web Client library (SWClLib) for Java
         http://www4.wiwiss.fu-berlin.de/bizer/ng4j/semwebclient/
       ● SWIC for Prolog
         http://moustaki.org/swic/
  • 47. Implementations
       ● SQUIN http://squin.org
            ● Provides SWClLib functionality as a Web service
            ● Accessible like a SPARQL endpoint
            ● Public SQUIN service at: http://squin.informatik.hu-berlin.de/SQUIN/
            ● Install package: unzip and start
            ● Convenient access with SQUIN PHP tools:
                 $s = 'http:// …'; // address of the SQUIN service
                 $q = new SparqlQuerySock( $s, '… SELECT ...' );
                 $res = $q->getJsonResult(); // or getXmlResult()
  • 48. Real-World Examples
       Return phone numbers of authors of ontology engineering papers at ESWC'09.
       SELECT DISTINCT ?author ?phone WHERE {
           ?pub swc:isPartOf <http://data.semanticweb.org/conference/eswc/2009/proceedings> .
           ?pub swc:hasTopic ?topic .
           ?topic rdfs:label ?topicLabel .
           FILTER regex( str(?topicLabel), "ontology engineering", "i" ) .
           ?pub swrc:author ?author .
           { ?author owl:sameAs ?authorAlt }
           UNION
           { ?authorAlt owl:sameAs ?author }
           ?authorAlt foaf:phone ?phone .
       }

       # of query results      2
       # of retrieved graphs   297
       # of accessed servers   16
       avg. execution time     1min 30sec