Realizing a Semantic Web Application
Emanuele Della Valle, Dario Cerizza, Irene Celino
http://www.cefriel.it    http://swa.cefriel.it    [email_address]    http://emanueledellavalle.org
7th Int. Semantic Web Conference, ISWC 2008, Karlsruhe, Germany, October 26, 2008
Center of Excellence For Research, Innovation, Education and industrial Lab partnership - Politecnico di Milano
Goal We will “develop” together an application of the Semantic Web we named  Music Event Explorer  or simply  meex We will challenge the Semantic Web technologies in realizing a new service for Web users Using Transforming and Combining existing data
Ingredients
- RDF as unified data model
- OWL as modelling language for the data sources
- GRDDL as a standard approach to translate into RDF the data stored in XML data sources
- D2RQ as a tool to translate into RDF the data stored in relational data sources
- SPARQL as standard query language to access RDF data
- Jena as application framework to merge the various data into a single RDF model and manipulate it
- Joseki as a tool to expose a SPARQL endpoint
- ARQ as SPARQL client library
- An RDF storage to guarantee persistency
- An OWL reasoner to infer new knowledge
- Exhibit as user interface
Approach
In order to realize meex:
- We start from the user need
- We derive user requirements
- We develop the ontologies and the software components
While presenting we will explain the use of Semantic Web technologies and tools (green background slides).
A demonstrative installation of the application, together with the source code, is available at http://swa.cefriel.it/meex
Towards a Semantic Web Ivan Herman in introducing the Semantic Web* explains: the current Web represents information using natural language (English, Hungarian, Chinese,…) graphics, multimedia, page layout Humans can process this easily can deduce facts from partial information can create mental associations are used to various sensory information (well, sort of… people with disabilities may have serious problems on the Web with rich media!) *  http://www.w3.org/People/Ivan/CorePresentations/IntroThroughExample/
Towards a Semantic Web
Tasks often require combining data on the Web:
- hotel and travel information may come from different sites
- searches in different digital libraries
- etc.
Again, humans combine this information easily, even if different terminologies are used!
Towards a Semantic Web However: machines are ignorant! partial information is unusable difficult to make sense from, e.g., an image drawing analogies automatically is difficult difficult to combine information automatically is  <foo:creator> same as <bar:author>? how to combine different XML hierarchies? …
A user need for meex
Imagine that users need to explore music events related to a given music style.
- An event is a concert, a show or a workshop in which one or more artists participate.
- An artist is either a single musician or a band.
For instance, if a user is interested in Folk music, meex
- finds the artists that play Folk music
- searches for events of those artists
- allows the user to explore the events related to each artist as a list, on a timeline and on a map
A manual solution
- I open musicmoz [1] and I look up artists that play Folk music
- If the pages of the artists on musicmoz don't satisfy me, I navigate to musicbrainz [2]
- I look up in EVDB [3] whether some of those artists have organized an event close to my location in these days
- I take note of the possible alternatives and I check how to get there using google maps [4]
[1] http://www.musicmoz.org   [2] http://www.musicbrainz.org   [3] http://www.eventful.com   [4] http://maps.google.com
Music Event Explorer Of course I can do it manually, but I need the time to do so. Can’t I write a mash-up?
What is needed? Ivan Herman in introducing the Semantic Web* explains (Some) data should be available for machines for further  processing Data should be possibly combined, merged on a Web scale Sometimes, data may describe other data (like the library  example, using metadata)… …  but sometimes the data is to be exchanged by itself, like my calendar or my travel preferences Machines may also need to  reason  about that data *  http://www.w3.org/People/Ivan/CorePresentations/IntroThroughExample/
The rough structure of data integration Map the various data onto an abstract data representation make the data independent of its internal representation… Merge the resulting representations Start making queries on the whole! queries that could not have been done on the individual data sets
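The three steps above (map, merge, query) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: triples are modelled as plain tuples, and every identifier (`artist:dylan`, `hasStyle`, `hasPerformer`, ...) is made up for the example rather than taken from meex.

```python
# Each source is mapped onto the same abstract representation: a set of
# (subject, predicate, object) triples. All identifiers are illustrative.
music_styles = {
    ("artist:beatles", "hasStyle", "style:Rock"),
    ("artist:dylan", "hasStyle", "style:Folk"),
}
events = {
    ("event:e1", "hasPerformer", "artist:dylan"),
    ("event:e1", "startTime", "2008-10-26"),
}

# Merging is plain set union, because both sources share one data model.
merged = music_styles | events


def events_for_style(triples, style):
    """Return sorted (event, artist) pairs for artists playing `style`."""
    artists = {s for (s, p, o) in triples if p == "hasStyle" and o == style}
    return sorted(
        (s, o) for (s, p, o) in triples
        if p == "hasPerformer" and o in artists
    )


# This query needs both sources: neither data set can answer it alone.
folk_events = events_for_style(merged, "style:Folk")
```

The query returns a result only on the merged set, which is the point of the slide: the interesting questions are the ones no single source can answer.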
So where is the Semantic Web? The Semantic Web provides technologies to make such  integration possible! For example: an  abstract model  for the relational graphs:  RDF extract  RDF information from XML (eg, XHTML) pages:  GRDDL add  structured information to XHTML pages:  RDFa a  query  language adapted for the relational graphs:  SPARQL characterize  the relationships, categorize resources:  RDFS ,  OWL ,  SKOS ,  Rules applications may choose among the different technologies some of them may be relatively simple with simple tools (RDFS), whereas some require sophisticated systems (OWL, Rules) reuse  of existing “ontologies” that others have produced
A Semantic Web application is still an application! A Semantic Web application is still an application, thus we need to follow good practice from Software Engineering in developing it. We adopt a Spiral Model inspired by the famous Boehm spiral model We extend it with Knowledge Engineering practices
[Figure: the spiral development process adopted for meex, with its steps]
- R.1 Users' needs analysis; R.2 Risk analysis; R.3 Software requirements analysis; R.4 Content requirements analysis
- D.1 Model the application ontology; D.2 Model the content ontology; D.3 Model sample contents (Reuse, Merge, Extend); D.4 Design Application
- I.1 Implement the initial Knowledge Base; I.2 Implement the integrated model (Reuse, Merge, Extend); I.3 Choose content annotation methods; I.4 Implement the application
- T.1 Testing; V.1 Validation
Content requirements analysis
Given that we are developing a Semantic Web application, it is crucial that we reuse data already available on the Web:
- EVDB - http://eventful.com
- MusicBrainz - http://musicbrainz.org
- MusicMoz - http://musicmoz.org
EVDB
EVDB is a Web 2.0 website that makes available information about events all around the world.
For each event it knows
- the start date
- the end date
- the place, in terms of address and geographic coordinates
EVDB offers a Web API in the form of a REST service; see http://api.evdb.com
MusicBrainz MusicBrainz  is a Web 2.0 website that gathered a large amount of information about music offers information about artists and bands songs, albums and tracks relations among artists and bands The data of MusicBrainz are available as a PostgreSQL dump see  http://musicbrainz.org/doc/DatabaseDownload
MusicMoz MusicMoz  is another Web 2.0 website dedicated to music offers information about artists and bands including their nationality music styles and their taxonomic relationships the styles each artist or band plays  reuses MusicBrainz identifier for artists and bands The data of MusicMoz are available as large XML files see  http://musicmoz.org/xml/
meex needs to merge this data
To manipulate all this data at the same time, meex needs to merge the data of the three data sources.
The artist and band information from MusicBrainz should be linked to
- the music styles they play, from MusicMoz
- the events related to them, from EVDB
Data Licences
The data of all three data sources are freely usable; we just need to make sure that the logos of the three applications appear on each page of meex.
- EVDB also requests a link to the permalink of the event on the EVDB website
- MusicBrainz also requests that derived data be made available under Creative Commons
Read more here:
- EVDB - http://api.eventful.com/terms
- MusicMoz - http://musicmoz.org/xml/license.html
- MusicBrainz - http://musicbrainz.org/doc/DatabaseDownload
Application requirements analysis (1) In this step (namely R.3) we should elicit functional requirements of the application as grouping and filtering data non-functional requirements of the application as performance and scalability w.r.t. number of users However this is just a tutorial, therefore we concentrate on functional requirements, leaving non-functional requirements underspecified
Application requirements analysis (2) Meex must enable a user to explore data in the form of a list a chronological graphic a geographic map for each event must show name begin and end date place for each artist must show name nationality music styles he/she plays related artists must allow users to filter and rank results
Motivations for RDF Eric Prud'hommeaux explains* Simple, consistent data model Uses web architecture for web scalability Glamorous use cases *  http://www.w3.org/2008/Talks/1027-ISWC/HCLS
What does RDF provide?
- A common (simple) model for all data.
- Incentive and infrastructure to re-use terms when possible and invent terms when necessary.
- Simple and complex ontological languages (RDFS and OWL).
- Intuitive re-use of now-familiar web topology.
- Scalable: partial (monotonic) reasoning allowed.
- Apps need not be re-written for each extension to a data model.
How do we write RDF?
- Name resources and relationships with URIs, e.g. http://emanueledellavalle.org/sw/foaf.rdf#me represents a person
- Express statements as subject, predicate, object
- Write the triples in
  - RDF/XML: standard serialization in XML
      <Description about="subject"> <property>value</property> </Description>
  - NTriples: simple (verbose) reference serialization (for specifications only)
      <http:// ... subject> <http:// ... predicate> "value" .
  - N3 and Turtle: developer-friendly serializations
      :subject :property "value" .
Turtle Introduction
RDF triples are analogous to one 3-place predicate holds(s, p, o):
    edv:me  foaf:knows  ic:me .
    holds(edv:me, foaf:knows, ic:me)
Triples are made from standard RDF terms:
- IRIs: <http://emanueledellavalle.org/sw/foaf.rdf#me> or <#me> or edv:me
- Literals: "Emanuele Della Valle"
- Typed literals: "3.14"^^xsd:float
- Literals with language tags: "日本語"@ja
- Blank nodes: [] or _:bob
Convenience Syntax
URI terms can be abbreviated using namespaces:
    @prefix edv: <http://emanueledellavalle.org/sw/foaf.rdf#> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    edv:me  rdf:type  foaf:Person .
'a' = <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>:
    edv:me  a  foaf:Person .
In-line blank nodes:
    edv:me foaf:knows [ foaf:name "Irene Celino" ] .
Convenience Syntax
Abbreviating repeated subjects:
    edv:me  rdf:type  foaf:Person .
    edv:me  foaf:knows  ic:me .
... is the same as ...
    edv:me  rdf:type  foaf:Person ;
            foaf:knows  ic:me .
Abbreviating repeated subject/predicate pairs:
    edv:me  foaf:knows  ic:me .
    edv:me  foaf:knows  dc:me .
... is the same as ...
    edv:me  foaf:knows  ic:me , dc:me .
Artist data in RDF
Original XML data as in MusicMoz:
    <category name="Bands_and_Artists/B/Beatles,_The" type="band">
      <resource name="musicbrainz" link="http://musicbrainz.org/artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d.html"/>
      <style number="1">British Invasion</style>
      <style number="2">Rock</style>
      <style number="3">Skiffle</style>
    </category>
The same data represented in RDF in meex:
    mb:artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d.html  a  mb:Band ;
      rdfs:label  "The Beatles" ;
      mm:hasStyle  mm:BritishInvasion, mm:Rock, mm:Skiffle .
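To make the XML-to-RDF mapping concrete, here is a minimal Python sketch that turns the MusicMoz fragment into triples. It is not the GRDDL/XSLT transformation meex uses; it swaps in the standard-library XML parser purely for illustration. The function name `category_to_triples`, the tuple representation of triples and the way style URIs are built (spaces replaced by hyphens under an assumed `http://musicmoz.org/style/` prefix) are all assumptions.

```python
import xml.etree.ElementTree as ET

MUSICMOZ_XML = """
<category name="Bands_and_Artists/B/Beatles,_The" type="band">
  <resource name="musicbrainz"
            link="http://musicbrainz.org/artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d.html"/>
  <style number="1">British Invasion</style>
  <style number="2">Rock</style>
  <style number="3">Skiffle</style>
</category>
"""

MM = "http://musicmoz.org/"  # assumed namespace for MusicMoz terms


def category_to_triples(xml_text):
    """Map one MusicMoz <category> element to (subject, predicate, object) triples."""
    category = ET.fromstring(xml_text.strip())
    # The MusicBrainz link identifies the artist, so it becomes the subject.
    subject = None
    for resource in category.findall("resource"):
        if resource.get("name") == "musicbrainz":
            subject = resource.get("link")
    if subject is None:
        return []  # no MusicBrainz identifier: nothing to link against
    triples = []
    for style in category.findall("style"):
        # Assumed URI scheme: style name with spaces turned into hyphens.
        style_uri = MM + "style/" + style.text.strip().replace(" ", "-")
        triples.append((subject, MM + "hasStyle", style_uri))
    return triples


beatles_triples = category_to_triples(MUSICMOZ_XML)
```

Using the MusicBrainz link as subject is what lets the MusicMoz styles join up with MusicBrainz artist data after merging.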
RDF Resources RDF at the W3C  - primer and specifications Semantic Web tools  - community maintained list; includes triple store, programming environments, tool sets, and more 302 Semantic Web Videos and Podcasts  - includes a section specifically on RDF videos
RDFS/OWL in a nutshell: class and instance
Creating a class
    RDFS:  Artist rdf:type rdfs:Class .
    FOL:   x Artist(x)
Creating a subclass
    RDFS:  Painter rdfs:subClassOf Artist .
    RDFS:  Sculptor rdfs:subClassOf Artist .
    FOL:   ∀x [Painter(x) ∨ Sculptor(x) → Artist(x)]
Creating an instance
    RDFS:  Rodin rdf:type Sculptor .
    FOL:   Sculptor(Rodin)
[Diagram: Artist with subclasses Painter and Sculptor; Rodin is an instance of Sculptor]
RDFS/OWL in a nutshell: properties
Creating a property
    RDFS:  creates rdf:type rdf:Property .
    FOL:   x   y Creates(x,y)
Using a property
    RDFS:  Rodin creates TheKiss .
    FOL:   Creates(Rodin, TheKiss)
Creating subproperties
    RDFS:  paints rdfs:subPropertyOf creates .
    FOL:   ∀x ∀y [Paints(x,y) → Creates(x,y)]
    RDFS:  sculpts rdfs:subPropertyOf creates .
    FOL:   ∀x ∀y [Sculpts(x,y) → Creates(x,y)]
[Diagram: paints as a subproperty of creates]
RDFS/OWL in a nutshell: range & domain
Checking which classes and predicates can be used together
    RDFS:
        creates rdfs:domain Artist .
        creates rdfs:range Piece .
        paints rdfs:domain Painter .
        paints rdfs:range Paint .
        sculpts rdfs:domain Sculptor .
        sculpts rdfs:range Sculpt .
    FOL:
        ∀x ∀y [Creates(x,y) → Artist(x) ∧ Piece(y)]
        ∀x ∀y [Paints(x,y) → Painter(x) ∧ Paint(y)]
        ∀x ∀y [Sculpts(x,y) → Sculptor(x) ∧ Sculpt(y)]
The resulting ontology
Some Inference Rules
    if                                                        then
    x rdfs:subClassOf y .     a rdf:type x .                  a rdf:type y .
    x rdfs:subClassOf y .     y rdfs:subClassOf z .           x rdfs:subClassOf z .
    x a y .                   a rdfs:subPropertyOf b .        x b y .
    a rdfs:subPropertyOf b .  b rdfs:subPropertyOf c .        a rdfs:subPropertyOf c .
    x a y .                   a rdfs:domain z .               x rdf:type z .
    x a u .                   a rdfs:range z .                u rdf:type z .
Inference at work
Sharing the ontology ...
    Sculptor rdfs:subClassOf Artist .
    Painter rdfs:subClassOf Artist .
    Painter owl:disjointWith Sculptor .
    Sculpt rdfs:subClassOf Piece .
    Painting rdfs:subClassOf Piece .
    Painting owl:disjointWith Sculpt .
    creates rdfs:domain Artist .
    creates rdfs:range Piece .
    sculpts rdfs:subPropertyOf creates .
    sculpts rdfs:domain Sculptor .
    sculpts rdfs:range Sculpt .
... when transmitting the following triple ...
    Rodin sculpts TheKiss .
... the recipient can ...
Inference at work …  the recipient can answer the syntactic query Sculpts(Rodin,TheKiss)?  yes but it can also answer queries such as  Sculptor(Rodin)?  yes Artist(Rodin)?  yes Painter(Rodin)?  no Sculpt(TheKiss)?  yes Piece(TheKiss)?  yes Painting(TheKiss)?  no Creates(Rodin,TheKiss)?  yes Painting(Rodin,TheKiss)? no NOTE : The recipient cannot give such answers without sharing the ontology (i.e., if the triple was an XML fragment valid w.r.t. a shared DTD)
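The positive answers above can be reproduced with a small forward-chaining sketch of the six RDFS rules from the "Some Inference Rules" slide. This is a toy fixpoint loop over tuples, not Jena's rule engine; the function name `rdfs_closure` is hypothetical. It deliberately ignores `owl:disjointWith`, so it cannot justify the "no" answers: in this sketch a missing triple only means "not derived".

```python
SUBCLASS, SUBPROP = "rdfs:subClassOf", "rdfs:subPropertyOf"
DOMAIN, RANGE, TYPE = "rdfs:domain", "rdfs:range", "rdf:type"


def rdfs_closure(triples):
    """Apply the six RDFS rules repeatedly until no new triple is derived."""
    facts = set(triples)
    while True:
        new = set()
        for (s, p, o) in facts:
            for (s2, p2, o2) in facts:
                if p == SUBCLASS and p2 == TYPE and o2 == s:
                    new.add((s2, TYPE, o))       # instances inherit the superclass
                if p == SUBCLASS and p2 == SUBCLASS and o == s2:
                    new.add((s, SUBCLASS, o2))   # subClassOf is transitive
                if p == SUBPROP and p2 == s:
                    new.add((s2, o, o2))         # statement also holds for the superproperty
                if p == SUBPROP and p2 == SUBPROP and o == s2:
                    new.add((s, SUBPROP, o2))    # subPropertyOf is transitive
                if p == DOMAIN and p2 == s:
                    new.add((s2, TYPE, o))       # subject gets the domain class
                if p == RANGE and p2 == s:
                    new.add((o2, TYPE, o))       # object gets the range class
        if new <= facts:
            return facts
        facts |= new


ontology = {
    ("Sculptor", SUBCLASS, "Artist"),
    ("Painter", SUBCLASS, "Artist"),
    ("Sculpt", SUBCLASS, "Piece"),
    ("Painting", SUBCLASS, "Piece"),
    ("creates", DOMAIN, "Artist"),
    ("creates", RANGE, "Piece"),
    ("sculpts", SUBPROP, "creates"),
    ("sculpts", DOMAIN, "Sculptor"),
    ("sculpts", RANGE, "Sculpt"),
}

# The recipient shares the ontology and receives a single triple.
inferred = rdfs_closure(ontology | {("Rodin", "sculpts", "TheKiss")})
```

After the closure, `inferred` contains `Rodin rdf:type Sculptor`, `Rodin rdf:type Artist`, `TheKiss rdf:type Piece` and `Rodin creates TheKiss`, none of which were transmitted explicitly.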
Model the Application Ontology As first design step (namely D.1) we model the  application ontology meex must manage information related to artists events at which the artists participate and music styles the artists play
Modeling Performer in OWL (Meex.n3)
    @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix owl:  <http://www.w3.org/2002/07/owl#> .
    @prefix meex: <http://swa.cefriel.it/meex#> .
    meex:Performer  a  owl:Class ;
      rdfs:label  "Performer" .
    meex:fromCountry  a  owl:DatatypeProperty ;
      rdfs:domain  meex:Performer ;
      rdfs:range   <http://www.w3.org/2001/XMLSchema#string> .
    meex:relatedPerformer  a  owl:ObjectProperty ;
      rdfs:domain  meex:Performer ;
      rdfs:range   meex:Performer .
[more to follow]
Modeling Style in OWL (Meex.n3)
[follows]
    meex:Style  a  owl:Class ;
      rdfs:label  "Music Style" .
    meex:performsStyle  a  owl:ObjectProperty ;
      rdfs:domain  meex:Performer ;
      rdfs:range   meex:Style .
[more to follow]
Modeling Event in OWL (Meex.n3)
[follows]
    meex:Event  a  owl:Class ;
      rdfs:label  "Event" .
    meex:performsEvent  a  owl:ObjectProperty ;
      rdfs:domain  meex:Performer ;
      rdfs:range   meex:Event .
    meex:hasWhen  a  owl:ObjectProperty ;
      rdfs:domain  meex:Event ;
      rdfs:range   gd:When .
    meex:hasWhere  a  owl:ObjectProperty ;
      rdfs:domain  meex:Event ;
      rdfs:range   gd:Where .
For each event we should model begin and end date together with the place, but an XML schema defined by Google already exists; thus we decide to reuse it by merging it.
Modeling When in OWL (GoogleSchema.n3)
[namespace declaration]
    gd:When  a  owl:Class ;
      rdfs:label  "Time" .
    gd:startTime  a  owl:DatatypeProperty ;
      rdfs:domain  gd:When ;
      rdfs:range   <http://www.w3.org/2001/XMLSchema#string> .
    gd:endTime  a  owl:DatatypeProperty ;
      rdfs:domain  gd:When ;
      rdfs:range   <http://www.w3.org/2001/XMLSchema#string> .
[more to follow]
Modeling Where in OWL (GoogleSchema.n3)
    gd:Where  a  owl:Class ;
      rdfs:label  "Location" .
    gd:postalAddress  a  owl:DatatypeProperty ;
      rdfs:domain  gd:Where ;
      rdfs:range   <http://www.w3.org/2001/XMLSchema#string> .
    gd:hasGeoPt  a  owl:ObjectProperty ;
      rdfs:domain  gd:Where ;
      rdfs:range   gd:GeoPt .
    gd:GeoPt  a  owl:Class ;
      rdfs:label  "Geo-referenced Point" .
    gd:lat  a  owl:DatatypeProperty ;
      rdfs:domain  gd:GeoPt ;
      rdfs:range   <http://www.w3.org/2001/XMLSchema#string> .
    gd:lon  a  owl:DatatypeProperty ;
      rdfs:domain  gd:GeoPt ;
      rdfs:range   <http://www.w3.org/2001/XMLSchema#string> .
    gd:label  rdfs:subPropertyOf  rdfs:label .
Model the content ontology We keep following our approach and we model the content ontology (step D.2) The content ontology models in OWL the data of the three data sources used by meex In the mean time we also model the sample contents (step D.3) that we will use to test meex during its implementation (see test-first method from Agile manifesto)
Modeling MusicBrainz schema in OWL (MusicBrainz.n3)
[Database diagram: tables artist and artist_relation, with columns id, gid, artist, ref]
    @prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix owl:   <http://www.w3.org/2002/07/owl#> .
    @prefix mb:    <http://musicbrainz.org/> .
    mb:Artist  a  owl:Class ;
      rdfs:label  "MusicBrainz Artist and Band" .
    mb:artist_relation  a  owl:ObjectProperty ;
      rdfs:domain  mb:Artist ;
      rdfs:range   mb:Artist .
Sample data for MusicBrainz in OWL (SampleInstance-MusicBrainz.n3)
    mb:artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d.html  a  mb:Artist ;
      rdfs:label  "The Beatles" ;
      mb:related_artist  mb:artist/ebfc1398-8d96-47e3-82c3-f782abcdb13d.html ,
                         mb:artist/618b6900-0618-4f1e-b835-bccb17f84294.html .
    mb:artist/ebfc1398-8d96-47e3-82c3-f782abcdb13d.html  a  mb:Artist ;
      rdfs:label  "The Beach Boys" .
    mb:artist/618b6900-0618-4f1e-b835-bccb17f84294.html  a  mb:Artist ;
      rdfs:label  "Eric Clapton" .
Please note that we choose to build the URI using the ID that MusicBrainz uses to identify the artists. This allows for easier reuse of meex data in other applications.
MusicMoz schema
[Schema diagram: a category element (name, type, from) that contains any number of resource elements (name, link) and style elements (string)]
Modeling MusicMoz schema in OWL (MusicMoz.n3)
    @prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix owl:   <http://www.w3.org/2002/07/owl#> .
    @prefix mm:    <http://musicmoz.org/> .
    @prefix mb:    <http://musicbrainz.org/> .
    mm:from  a  owl:DatatypeProperty ;
      rdfs:domain  mb:Artist ;
      rdfs:range   <http://www.w3.org/2001/XMLSchema#string> .
    mm:Style  a  owl:Class ;
      rdfs:label  "MusicMoz Music Style" .
    mm:hasStyle  a  owl:ObjectProperty ;
      rdfs:domain  mb:Artist ;
      rdfs:range   mm:Style .
Sample data for MusicMoz in OWL (SampleInstance-MusicMoz.n3)
    mb:artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d.html
      mm:from  "England" ;
      mm:hasStyle  mm:style/British-Invasion ,
                   mm:style/Rock ,
                   mm:style/Skiffle .
    mm:style/British-Invasion  a  mm:Style ;
      rdfs:label  "British Invasion" .
Please note that also in this case we use the ID derived from MusicBrainz.
Modeling EVDB schema in OWL (EVDB.n3)
    @prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix owl:   <http://www.w3.org/2002/07/owl#> .
    @prefix evdb:  <http://eventful.com/> .
    @prefix gd:    <http://schemas.google.com/g/2005> .
    evdb:Event  a  owl:Class ;
      rdfs:label  "Eventful Event" .
    evdb:hasWhen  a  owl:ObjectProperty ;
      rdfs:domain  evdb:Event ;
      rdfs:range   gd:When .
    evdb:hasWhere  a  owl:ObjectProperty ;
      rdfs:domain  evdb:Event ;
      rdfs:range   gd:Where .
Please note that we reuse the concepts When and Where that we modeled in the application ontology by merging the Google schema (see GoogleSchema.n3).
Sample data for EVDB in OWL (SampleInstance-EVDB.n3)
    evdb:events/E0-001-008121669-0@2008022719  a  evdb:Event ;
      gd:label  "Tell Me Why: A Beatles Commentary" ;
      evdb:hasWhen  evdb:events/E0-001-008121669-0@2008022719_When ;
      evdb:hasWhere  evdb:events/E0-001-008121669-0@2008022719_Where .
    evdb:events/E0-001-008121669-0@2008022719_When
      gd:startTime  "2008-02-28" ;
      gd:endTime  "2008-02-28" .
    evdb:events/E0-001-008121669-0@2008022719_Where
      gd:hasGeoPt  evdb:events/E0-001-008121669-0@2008022719_GeoPt ;
      gd:label  "The Wilmington Memorial Library" ;
      gd:postalAddress  "175 Middlesex Avenue, Wilmington, USA" .
    evdb:events/E0-001-008121669-0@2008022719_GeoPt
      gd:lat  "42.556943" ;
      gd:lon  "-71.165576" .
“ Application Connected by Concepts ” artists Music styles events time places Meex ontology MusicBrainz EVDB  MusicMoz Meex
Why SPARQL? Eric Prud'hommeaux explains* that SPARQL is the query language of the Semantic Web.  It lets us: Pull values from structured and semi-structured data Explore data by querying unknown relationships Perform complex joins of disparate databases in a single, simple query Transform RDF data from one vocabulary to another *  http://www.w3.org/2008/Talks/1027-ISWC/HCLS
SELECTing variables
SPARQL variables bind to RDF terms
    Ex. ?artist, ?album, ?times_platinum
Like SQL, we pick the variables we want from a query with a SELECT clause
    Ex. SELECT ?artist ?album ?times_platinum
A SELECT query results in a table of values:
    ?artist          ?album           ?times_platinum
    Michael Jackson  Thriller         27
    Led Zeppelin     Led Zeppelin IV  22
    Pink Floyd       The Wall         22
Triple patterns A triple pattern is an RDF triple that can have variables in any of the subject, predicate, or object positions. Examples: Find countries and their capital cities: ?country geo:capital ?capital . Given a FOAF URI, find the person's name: edv:me foaf:name ?name . What direct relationships exist between two people? edv:me ?relationship ic:me .
Simple query pattern We can combine more than one triple pattern to retrieve multiple values and easily traverse an RDF graph: Find countries, their capital cities, and their populations: ?country geo:capital ?capital ; geo:population ?population . Given a FOAF URI, find the person's name and friends' names: edv:me foaf:name ?name ; foaf:knows ?friend . ?friend  foaf:name ?friend_name . Retrieve all third-line managers in the company: ?emp hr:managedBy ?first_line . ?first_line hr:managedBy ?second_line . ?second_line hr:managedBy ?third_line .
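How several triple patterns combine into one query can be shown with a toy matcher in Python. This is a sketch of basic graph pattern matching only, not a SPARQL engine: terms are plain strings, a term starting with `?` is treated as a variable, and the functions `match_pattern` and `query` as well as the sample graph are hypothetical.

```python
def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`; None on failure."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(patterns, triples):
    """Return every variable binding that satisfies all triple patterns."""
    bindings = [{}]
    for pattern in patterns:
        extended_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    extended_bindings.append(extended)
        bindings = extended_bindings
    return bindings


# Illustrative FOAF-like data; names and prefixes are sample values.
graph = [
    ("edv:me", "foaf:name", "Emanuele Della Valle"),
    ("edv:me", "foaf:knows", "ic:me"),
    ("edv:me", "foaf:knows", "dc:me"),
    ("ic:me", "foaf:name", "Irene Celino"),
    ("dc:me", "foaf:name", "Dario Cerizza"),
]

# "Given a FOAF URI, find the person's name and friends' names."
patterns = [
    ("edv:me", "foaf:name", "?name"),
    ("edv:me", "foaf:knows", "?friend"),
    ("?friend", "foaf:name", "?friend_name"),
]
results = query(patterns, graph)
```

The shared variable `?friend` is what joins the second and third pattern, which is how a query traverses the graph from a person to the names of the people they know.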
Result forms Besides selecting tables of values, SPARQL allows three other types of queries: ASK - returns a boolean answering, does the query have any results? CONSTRUCT - uses variable bindings to return new RDF triples DESCRIBE - returns server-determined RDF about the queried resources SELECT and ASK results can be returned as XML or JSON.  CONSTRUCT and DESCRIBE results can be returned via any RDF serialization (e.g. RDF/XML or Turtle).
Protocol Mechanics
The SPARQL Protocol is a simple method for asking and answering SPARQL queries over HTTP. A SPARQL URL is built from three parts:
- The URL of a SPARQL endpoint, e.g. http://dbpedia.org/sparql
- (Optional, as part of the query string) The graphs to be queried against, e.g. default-graph-uri=http://dbpedia.org
- (As part of the query string) The query itself, e.g. query=SELECT distinct ?x WHERE {?x a <http://umbel.org/umbel/sc/Artist> }
This is the resulting URL:
http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=SELECT+distinct+%3Fx+WHERE+{%3Fx+a+%3Chttp%3A%2F%2Fumbel.org%2Fumbel%2Fsc%2FArtist%3E+}
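Building such a URL by hand is error-prone because the query must be percent-encoded. A minimal Python sketch using only the standard library follows; the helper name `sparql_url` is hypothetical, and no request is actually sent. Note that `urlencode` also escapes the curly braces (as `%7B` and `%7D`), which decodes to the same query.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

QUERY = "SELECT distinct ?x WHERE {?x a <http://umbel.org/umbel/sc/Artist> }"


def sparql_url(endpoint, query, default_graph=None):
    """Assemble a SPARQL Protocol GET URL from endpoint, optional graph and query."""
    params = []
    if default_graph is not None:
        params.append(("default-graph-uri", default_graph))
    params.append(("query", query))
    # urlencode percent-encodes reserved characters and turns spaces into '+'.
    return endpoint + "?" + urlencode(params)


url = sparql_url("http://dbpedia.org/sparql", QUERY, default_graph="http://dbpedia.org")

# Decoding the query string gives back the original parameters.
decoded = parse_qs(urlsplit(url).query)
```

A client would then issue an HTTP GET on `url` and ask for the result format it wants through the `Accept` header.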
SPARQL Resources SPARQL Frequently Asked Questions SPARQL implementations  - community maintained list of open-source and commercial SPARQL engines Public SPARQL endpoints  - community maintained list SPARQL extensions  - collection of SPARQL extensions implemented in various SPARQL engines
Summary We are done with the modeling of ontologies and sample contents We can now design meex (step D.4 of our approach) In order to design meex architecture We first  design  its  interfaces  in terms of both graphic user interface  and connection to the three data sources Secondly we  design  how it works  inside  in terms of   components and execution semantics
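meex interfaces
[Architecture diagram] Through the Web browser, the user sends (1) a music style to meex and receives (3) HTML and RDF. meex obtains (2) RDF from three sources: the MusicBrainz database, through a Database → RDF adapter exposed as a SPARQL server; the EVDB REST service, whose XML goes through a GRDDL processor (EVDB → RDF); and the MusicMoz XML file, which goes through a GRDDL processor (MusicMoz → RDF).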
How we access the data
To get RDF data out of the three external data sources we can use different techniques:
- For the MusicBrainz database we can use tools that make it possible to query non-RDF databases as virtual RDF graphs through a standard SPARQL endpoint
- For the MusicMoz XML files we can use a GRDDL processor with the XSLT MusicMoz->RDF
- For EVDB we can use a GRDDL processor applying the XSLT EVDB->RDF to the XML file obtained from the EVDB REST service
User Interface
In order to collect users' input and to present results back to the users, we can use Web 2.0 technologies and develop an AJAX interface.
Such an AJAX interface must allow for
- Entering the music style that the resulting events will refer to
- Exploring the events found by meex
- Filtering the events based on
  - artists
  - their nationality
  - the music style they play
Designing how meex works inside
[Data-flow diagram] The Ajax Web framework passes the music style to a GRDDL processor (MusicMoz → RDF), which yields a set of artists in RDF. For each artist, a SPARQL client sends a SPARQL query to the MusicBrainz SPARQL endpoint and gets artist data in RDF, while an HTTP REST client sends an HTTP query to the EVDB HTTP REST service and gets events in XML, which a GRDDL processor (EVDB → RDF) turns into events in RDF. A "Linking Artists to Events" step and an RDF merge produce artists and events in RDF; an "Extraction and Transformation" step hands the RDF data back to the Ajax Web framework.
Execution Semantics (1)
- The user requests a music style
- meex accesses the local copy of MusicMoz and, using the GRDDL processor, obtains the set of artists that play the given music style
[more to follow]
Execution Semantics (2)
[follows]
For each artist, meex:
- uses the SPARQL client to query the MusicBrainz SPARQL endpoint and obtains the artist's name and his/her relationships with other artists
- invokes the EVDB REST service, obtains in XML the events that refer to the artist, and uses the GRDDL processor to obtain this data in RDF
- links the data about each artist to the data about the events that refer to him/her
[more to follow]
Execution Semantics (3) [follows] When all the peaces of information about artists and events are available in the RDF storage, meex extracts them and serializes them in the format of the Ajax Web framework The ajax Web framework allows the user for exploring the events found by meex When the user decides to start a new exploration, meex starts over from the beginning
Two important internal components The  RDF storage  must be initialized with both the application and the content ontology is filled in with the data meex loads from the three data source given the music style requested by the user The  reasoner allows all query in meex to be express in terms of the application ontology even if data are loaded from the data sources using the content ontology NOTE: the reasoner support the semantic integration of the data loaded from the external data sources. The meex’s programmer can ignore that multiple and heterogeneous data sources were used to load data
[Diagram: development process. R.1 Users’ needs analysis; R.2 Risk analysis; R.3 Software requirements analysis; R.4 Content requirements analysis; D.1 Model the application ontology; D.2 Model the content ontology; D.3 Model sample contents; D.4 Design application; I.1 Implement the initial Knowledge Base; I.2 Implement the integrated model; I.3 Choose content annotation methods; I.4 Implement the application; V.1 Validation; T.1 Testing. The label "Reuse, Merge, Extend" appears twice in the diagram.]
Implement the initial Knowledge Base (1) We start implementing meex by setting up the initial knowledge base (step I.1). We need to select tools to: read and write RDF in the RDF/XML and RDF/N3 syntaxes; manipulate RDF programmatically; store RDF; reason on OWL; and interpret SPARQL.
Implement the initial Knowledge Base (2) We choose Jena because it: offers an API to read and write different RDF syntaxes; provides a programmatic environment for RDF, RDFS, OWL and SPARQL; guarantees RDF model persistence through several relational database adapters; includes a rule-based inference engine that implements OWL semantics; and includes ARQ, a query engine that supports SPARQL. To use the RDF storage and the OWL reasoner from Jena we need to configure them as shown in the following slides.
Configuring the RDF storage
Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
DBConnection con = new DBConnection(
  "jdbc:derby:C:/Meex/RDFStorage;create=true",
  "sa", "", "Derby");
Model model =
  ModelFactory.createModelRDBMaker(con).
  createDefaultModel();
We choose Derby (from Apache) as the relational database underneath the RDF storage. With row 1 we tell Jena where to find the JDBC driver. With row 2 we define the JDBC connection. With row 3 we instantiate the Jena model object that we will use to access and manipulate the RDF model in the storage.
Configuring the OWL reasoner
Reasoner reasoner = ReasonerRegistry.getOWLMicroReasoner();
model = ModelFactory.createInfModel(reasoner, model);
Jena offers numerous options to configure its internal rule-based inference engine, with different trade-offs between expressivity and performance. We only need simple reasoning features (i.e., the transitive closure of subClassOf and subPropertyOf), so the OWL Micro configuration is the most appropriate one. With row 1 we instantiate an OWL Micro reasoner. With row 2 we instantiate a model with inference support from the previously created model and the OWL Micro reasoner.
Implement the integrated model (1) We move on with the implementation of meex by realizing the integrated model (step I.2). In the integrated model we merge the application ontology and the content ontology. Our intent is to integrate semantically the heterogeneous data coming from the external data sources. To realize the integrated model we need to define a bridge ontology that uses the properties rdfs:subClassOf and rdfs:subPropertyOf to connect classes and properties in the application ontology to those in the content ontology.
Implement the integrated model (2)
mb:Artist rdfs:subClassOf meex:Performer .
mb:related_artist rdfs:subPropertyOf meex:relatedPerformer .
mm:Style rdfs:subClassOf meex:Style .
mm:hasStyle rdfs:subPropertyOf meex:performsStyle .
mm:from rdfs:subPropertyOf meex:fromCountry .
evdb:Event rdfs:subClassOf meex:Event .
evdb:hasWhen rdfs:subPropertyOf meex:hasWhen .
evdb:hasWhere rdfs:subPropertyOf meex:hasWhere .
In rows 1 and 2 we connect the ontology of MusicBrainz to the application ontology, i.e. the classes mb:Artist and meex:Performer and the properties mb:related_artist and meex:relatedPerformer. Likewise, in rows 3, 4 and 5 we connect the ontology of MusicMoz to the application ontology, and in rows 6, 7 and 8 we connect the ontology of EVDB to the application ontology.
Implement the integrated model (3) Thanks to this bridge ontology, when data loaded from the external data sources are inserted in the RDF storage (using the data-source-specific ontologies), the OWL Micro reasoner infers the triples that represent the same data in the application ontology. meex can therefore query the RDF storage homogeneously in terms of the application ontology, without caring about the heterogeneous formats of the three data sources. To give an idea of the differences, in the next slide we compare the data expressed in the MusicBrainz ontology and in the application ontology.
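The effect of the bridge ontology can be illustrated without Jena. The following is a minimal sketch, assuming that triples are plain lists of three strings and that only the two RDFS rules mentioned above (subClassOf and subPropertyOf propagation) are needed; the class `BridgeInferenceSketch`, its methods and the `ex:` resources are hypothetical names introduced for illustration, and the loop is a naive fixpoint, not Jena's rule engine.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy forward chaining over two RDFS rules; an illustration, not Jena's OWL Micro reasoner. */
final class BridgeInferenceSketch {

    /** Builds a triple as an immutable list (subject, predicate, object). */
    static List<String> triple(String s, String p, String o) {
        return List.of(s, p, o);
    }

    /** Returns the asserted triples plus everything derivable with the two rules below. */
    static Set<List<String>> infer(Set<List<String>> asserted) {
        Set<List<String>> closure = new LinkedHashSet<>(asserted);
        boolean changed = true;
        while (changed) {
            List<List<String>> derived = new ArrayList<>();
            for (List<String> schema : closure) {
                for (List<String> data : closure) {
                    // (C rdfs:subClassOf D), (x rdf:type C)  =>  (x rdf:type D)
                    if (schema.get(1).equals("rdfs:subClassOf")
                            && data.get(1).equals("rdf:type")
                            && data.get(2).equals(schema.get(0))) {
                        derived.add(triple(data.get(0), "rdf:type", schema.get(2)));
                    }
                    // (p rdfs:subPropertyOf q), (x p y)  =>  (x q y)
                    if (schema.get(1).equals("rdfs:subPropertyOf")
                            && data.get(1).equals(schema.get(0))) {
                        derived.add(triple(data.get(0), schema.get(2), data.get(2)));
                    }
                }
            }
            // Stop when a full pass adds nothing new.
            changed = closure.addAll(derived);
        }
        return closure;
    }

    public static void main(String[] args) {
        Set<List<String>> kb = new LinkedHashSet<>();
        kb.add(triple("mb:Artist", "rdfs:subClassOf", "meex:Performer"));
        kb.add(triple("mb:related_artist", "rdfs:subPropertyOf", "meex:relatedPerformer"));
        kb.add(triple("ex:beatles", "rdf:type", "mb:Artist"));          // hypothetical resource
        kb.add(triple("ex:beatles", "mb:related_artist", "ex:other"));  // hypothetical resource
        infer(kb).forEach(System.out::println);
    }
}
```

Data asserted with mb: terms thus become visible under meex: terms, which is why queries can be written against the application ontology only.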
Implement the integrated model (4)
SampleInstance-MusicBrainz.n3:
mb:artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d.html a mb:Artist ;
  rdfs:label "The Beatles" ;
  mb:related_artist
    mb:artist/ebfc1398-8d96-47e3-82c3-f782abcdb13d.html ,
    mb:artist/618b6900-0618-4f1e-b835-bccb17f84294.html .
Data-in-MusicBrainz-inferred-using-the-bridge-ontology.n3:
mb:artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d.html a meex:Performer ;
  rdfs:label "The Beatles" ;
  meex:relatedPerformer
    mb:artist/ebfc1398-8d96-47e3-82c3-f782abcdb13d.html ,
    mb:artist/618b6900-0618-4f1e-b835-bccb17f84294.html .
Implement the integrated model (5)
model.read("Meex.n3", "", "N3");
model.read("Google.n3", "", "N3");
model.read("MeexBindings.n3", "", "N3");
model.read("MusicBrainz.n3", "", "N3");
model.read("MusicMoz.n3", "", "N3");
model.read("EVDB.n3", "", "N3");
Now that we have configured both the RDF storage and the reasoner, we can load all the ontologies. Note that the read method of model requires: the name of the file to load, the base URI (in our case all URIs are absolute) and the RDF syntax in which the data are serialized.
Testing the integrated model A simple test to verify the semantic soundness of all the ontologies we modelled consists in loading into the model the examples we produced (in step D.3) and dumping the entire content of the RDF storage to a single file using the write method, which takes an output stream and the RDF syntax:
model.write(new FileOutputStream("Dump.n3"), "N3");
If we open the file Dump.n3 we can verify the presence of all the inferred triples we presented in slide 89.
Choose content annotation methods Following the proposed approach, the next step (I.3) is to choose the content annotation methods. The contents we chose for meex are already annotated at the data source level; we only need to lift the data from XML or from the relational database into instances of the content ontology. In the following slides we show how to implement and configure all the components necessary to allow meex to load data from the external data sources.
meex interfaces (1) [Diagram: the meex interfaces overview, repeated from the earlier "meex interfaces" slide.]
Importing annotations from MusicBrainz The annotations of MusicBrainz are distributed as a dump of a PostgreSQL database, so first of all we install the PostgreSQL relational database; the necessary documentation is available on the official PostgreSQL and MusicBrainz websites. Once the database is available, we need to install and configure a translator from relational database to RDF and a SPARQL endpoint. We choose D2RQ as translator and Joseki as SPARQL server.
Configuring D2RQ for MusicBrainz (1)
D2RQ-MusicBrainzDB.n3:
@prefix map: <http://swa.cefriel.it/meex/D2RQ-MusicBrainz.n3#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .
@prefix mb: <http://musicbrainz.org/> .
map:database a d2rq:Database;
  d2rq:jdbcDriver "org.postgresql.Driver";
  d2rq:jdbcDSN "jdbc:postgresql://localhost:5432/MusicBrainzDB";
  d2rq:username "postgres";
  d2rq:password "sw-book".
[more to follow]
[Diagram: table artist (columns id, gid) and table artist_relation (columns artist, ref).]
Configuring D2RQ for MusicBrainz (2) [follows]
D2RQ-MusicBrainzDB.n3:
map:artist a d2rq:ClassMap;
  d2rq:dataStorage map:database;
  d2rq:class mb:Artist;
  d2rq:uriPattern "http://musicbrainz.org/artist/@@artist.gid@@.html".
map:artist_name a d2rq:PropertyBridge;
  d2rq:belongsToClassMap map:artist;
  d2rq:property rdfs:label;
  d2rq:column "artist.name".
map:artist_relation a d2rq:PropertyBridge;
  d2rq:belongsToClassMap map:artist;
  d2rq:property mb:artist_relation;
  d2rq:join "artist.id = artist_relation.artist";
  d2rq:join "artist_relation.ref = artist2.id";
  d2rq:uriPattern "http://musicbrainz.org/artist/@@artist2.gid@@.html".
NOTE: due to a limitation of D2RQ we need to create a view of the Artist table so that it can be joined with itself:
CREATE VIEW Artist2 AS SELECT * FROM Artist;
[Diagram: table artist (columns id, gid) and table artist_relation (columns artist, ref).]
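To make the role of d2rq:uriPattern concrete, the sketch below expands the `@@table.column@@` placeholders of a pattern with the values of one database row. It only illustrates the idea: the class `UriPatternSketch`, its `expand` method and the map-based row are hypothetical, and D2RQ's own implementation may differ (for example in how it encodes values).

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrates how a URI pattern with @@table.column@@ placeholders yields one URI per row. */
final class UriPatternSketch {

    private static final Pattern PLACEHOLDER = Pattern.compile("@@([^@]+)@@");

    /** Replaces every placeholder with the value stored in the row under the same column name. */
    static String expand(String pattern, Map<String, String> row) {
        Matcher m = PLACEHOLDER.matcher(pattern);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String column = m.group(1);
            String value = row.get(column);
            if (value == null) {
                throw new IllegalArgumentException("No value for column " + column);
            }
            // quoteReplacement keeps '$' and '\' in the value from being read as group references
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String uri = expand(
            "http://musicbrainz.org/artist/@@artist.gid@@.html",
            Map.of("artist.gid", "b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d"));
        System.out.println(uri);
    }
}
```

The same expansion applied to the artist2 view gives the object URIs of the artist relation, which is why both the class map and the property bridge carry a pattern.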
Configuring Joseki for MusicBrainz
joseki-config.ttl:
[] rdf:type joseki:Service ;
  rdfs:label "SPARQL for MusicBrainzDB" ;
  joseki:serviceRef "MusicBrainz" ;
  joseki:dataset _:MusicBrainzDS ;
  joseki:processor joseki:ProcessorSPARQL_FixedDS .
_:MusicBrainzDS rdf:type ja:RDFDataset ;
  ja:defaultGraph _:MusicBrainzModel ;
  rdfs:label "MusicBrainz Dataset" .
_:MusicBrainzModel rdf:type d2rq:D2RQModel ;
  rdfs:label "MusicBrainz D2RQ Model" ;
  d2rq:mappingFile <file:D2RQ-MusicBrainzDB.n3> ;
  d2rq:resourceBaseURI <http://musicbrainz.org/> .
With row 1 we expose a SPARQL endpoint, giving the name of the service and hence the URL at which it becomes accessible: http://localhost:2020/MusicBrainz. With rows 2 and 3 we configure the SPARQL endpoint to expose MusicBrainz via D2RQ using the configuration file D2RQ-MusicBrainzDB.n3 (see previous slide).
Testing the SPARQL endpoint
String sparqlQueryString = "PREFIX mb: <http://musicbrainz.org/>\n"
  + "DESCRIBE <" + artist + ">";
Query query = QueryFactory.create(sparqlQueryString);
QueryExecution qexec = QueryExecutionFactory.sparqlService
  ("http://localhost:2020/MusicBrainz", query);
Model resultModel = qexec.execDescribe();
We choose ARQ to test the MusicBrainz SPARQL endpoint by submitting a SPARQL DESCRIBE query that returns the description of an artist. With row 1 we define the SPARQL query, in which the variable artist contains the URI of the artist we want described. With rows 2 and 3 we instantiate a query object and configure the QueryExecution to send the query to the endpoint at the URL http://localhost:2020/MusicBrainz. With row 4 we execute the query and obtain a Jena model as the result.
meex interfaces (2) [Diagram: the meex interfaces overview, repeated from the earlier "meex interfaces" slide.]
Importing annotations from MusicMoz and EVDB The MusicBrainz SPARQL endpoint is ready; let’s import the annotations from MusicMoz and EVDB. Both exchange data in XML. In the design steps we chose to use a GRDDL processor to convert from XML to RDF (in the RDF/XML syntax). The GRDDL recommendation requires the XML documents to refer directly to the XSLT that performs the translation. Neither the MusicMoz nor the EVDB XML files originally include the reference required by GRDDL, but we can add it programmatically. In the following slide we show an excerpt of the modified XML files for MusicMoz; we can proceed likewise for EVDB.
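Adding the GRDDL reference programmatically amounts to setting one namespaced attribute on the root element. The following is a minimal sketch using only the JDK's DOM and identity-transform APIs; the class `GrddlAnnotatorSketch`, its method names and the stylesheet location `file:///path/to/musicmoz-to-rdf.xsl` are assumptions made for illustration and are not taken from the meex source code.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Adds the grddl:transformation attribute to the root element of an XML document. */
final class GrddlAnnotatorSketch {

    static final String GRDDL_NS = "http://www.w3.org/2003/g/data-view#";
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    /** Parses the XML and declares the XSLT that a GRDDL processor should apply to it. */
    static Document addTransformation(String xml, String xsltUri) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        // Declare the grddl prefix, then point to the transformation.
        root.setAttributeNS(XMLNS_NS, "xmlns:grddl", GRDDL_NS);
        root.setAttributeNS(GRDDL_NS, "grddl:transformation", xsltUri);
        return doc;
    }

    /** Serializes the document back to text with an identity transformation. */
    static String serialize(Document doc) throws Exception {
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
            .transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = addTransformation(
            "<musicmoz><style><name>Rock</name></style></musicmoz>",
            "file:///path/to/musicmoz-to-rdf.xsl"); // hypothetical location
        System.out.println(serialize(doc));
    }
}
```

For the large MusicMoz dumps a streaming rewrite may be preferable to loading the whole DOM; the DOM version is used here only because it is the shortest to read.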
Importing annotations from MusicMoz (1)
<musicmoz xmlns:grddl="http://www.w3.org/2003/g/data-view#"
    grddl:transformation="file:///[...]/musicmoz-to-rdf.xsl">
  <category name="Bands_and_Artists/B/Beatles,_The" type="band">
    <resource name="musicbrainz"
        link="http://musicbrainz.org/artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d.html"/>
    <from>England</from>
    <style number="1">British Invasion</style>
    <style number="2">Rock</style>
    <style number="3">Skiffle</style>
  </category>
  <style><name>British Invasion</name></style>
  <style><name>Rock</name></style>
  <style><name>Skiffle</name></style>
</musicmoz>
Excerpts from the files musicmoz.bandsandartists.xml and musicmoz.lists.styles.xml
Importing annotations from MusicMoz (2)
<xsl:template match="musicmoz/category[(@type='band' or @type='artist') and resource/@name='musicbrainz']">
  <xsl:variable name="artist_uri" select="resource[@name='musicbrainz']/@link"/>
  <xsl:for-each select="style">
    <xsl:variable name="style_reformatted" select="concat('http://musicmoz.org/style/',text())"/>
    <rdf:Description rdf:about="{$artist_uri}">
      <mm:hasStyle rdf:resource="{$style_reformatted}"/>
    </rdf:Description>
  </xsl:for-each>
  <rdf:Description rdf:about="{$artist_uri}">
    <mm:from><xsl:value-of select="from"/></mm:from>
  </rdf:Description>
</xsl:template>
<xsl:template match="musicmoz/style">
  <xsl:variable name="style_reformatted" select="concat('http://musicmoz.org/style/', name)"/>
  <mm:Style rdf:about="{$style_reformatted}">
    <rdfs:label><xsl:value-of select="name"/></rdfs:label>
  </mm:Style>
</xsl:template>
Excerpts from the file musicmoz-to-rdf.xsl
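A GRDDL processor ultimately runs the referenced XSLT over the XML document. To see what the style template produces, the sketch below applies a cut-down stylesheet with the JDK's built-in XSLT engine instead of a GRDDL processor. The class `XsltLiftSketch` is hypothetical, and the namespace bound to the `mm` prefix (`http://example.org/mm#`) is a placeholder, since the excerpt does not show the real one.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Applies a reduced version of the style template with the JDK's XSLT 1.0 engine. */
final class XsltLiftSketch {

    // Reduced stylesheet: only the template for musicmoz/style, wrapped in rdf:RDF.
    // The mm namespace URI is a placeholder chosen for this sketch.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'"
        + " xmlns:rdfs='http://www.w3.org/2000/01/rdf-schema#'"
        + " xmlns:mm='http://example.org/mm#'>"
        + "<xsl:template match='/'>"
        + "<rdf:RDF><xsl:apply-templates select='musicmoz/style'/></rdf:RDF>"
        + "</xsl:template>"
        + "<xsl:template match='musicmoz/style'>"
        + "<mm:Style rdf:about='{concat(\"http://musicmoz.org/style/\", name)}'>"
        + "<rdfs:label><xsl:value-of select='name'/></rdfs:label>"
        + "</mm:Style>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Transforms a MusicMoz-like XML string into RDF/XML text. */
    static String lift(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(lift("<musicmoz><style><name>Rock</name></style></musicmoz>"));
    }
}
```

Each style element becomes one mm:Style resource whose URI is built by concatenation, exactly as in the second template of the excerpt.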
Importing annotations from MusicMoz (3) As GRDDL processor we choose GRDDL Reader, the GRDDL processor for Jena.
Model mmModel = ModelFactory.createDefaultModel();
RDFReader reader = mmModel.getReader("GRDDL");
reader.read(mmModel, "file:///.../musicmoz.bandsandartists.xml");
reader.read(mmModel, "file:///.../musicmoz.lists.styles.xml");
model.add(mmModel);
With row 1 we instantiate a Jena model that temporarily holds the RDF data produced by the GRDDL processor. With row 2 we instantiate an RDFReader that uses a GRDDL processor to load RDF data. With rows 3 and 4 we load into the RDF model instantiated in row 1 the data contained in the MusicMoz XML files, using the RDF reader configured for GRDDL. With row 5 we merge the loaded RDF data with those already present in the RDF storage.
So far so good! (1) [Diagram: the meex interfaces overview, repeated from the earlier "meex interfaces" slide.]
So far so good! (2) [Diagram: the internal data flow of meex, repeated from the slide "Designing how meex works inside".]
What’s left? All the business logic that coordinates the interaction among the internal components is still to be implemented. NOTE: implementing the business logic requires both writing many lines of plain Java code and working with several Semantic Web technologies; we will focus our attention on the Semantic Web technologies. The complete Java code is available for download on the meex website. See http://swa.cefriel.it/meex
What’s left? [Diagram: the internal data flow of meex, repeated from the slide "Designing how meex works inside".]
MEMO: Execution Semantics (1) 1. The user requests a music style. 2. meex accesses the local copy of MusicMoz and, using the GRDDL processor, obtains the set of artists that play the given music style. [more to follow]
Step 2: from the music style to the artists Step 2 of the meex execution semantics requires querying MusicMoz for the artists that play the music style requested by the user. The following Java code shows how to encode the SPARQL query in terms of the application ontology:
String sparqlQueryString =
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
  + "PREFIX meex: <http://swa.cefriel.it/meex#>\n"
  + "SELECT DISTINCT ?performer \n"
  + "WHERE { ?performer meex:performsStyle ?style.\n"
  + "  ?style rdfs:label \"" + style + "\".}";
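Because the style comes from the user and is concatenated into the query text, a quote or backslash in it would break the query. The sketch below shows one way to guard against that by escaping the value before embedding it as a SPARQL string literal; the class `StyleQuerySketch` and its methods are hypothetical helpers, not part of meex or of ARQ.

```java
/** Builds the step-2 query with the user-supplied style escaped as a SPARQL string literal. */
final class StyleQuerySketch {

    /** Escapes the characters that may not appear unescaped inside a double-quoted literal. */
    static String escapeLiteral(String raw) {
        StringBuilder sb = new StringBuilder();
        for (char c : raw.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '"':  sb.append("\\\""); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Returns the SPARQL query that selects the performers of the given style. */
    static String artistsByStyle(String style) {
        return "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
             + "PREFIX meex: <http://swa.cefriel.it/meex#>\n"
             + "SELECT DISTINCT ?performer\n"
             + "WHERE { ?performer meex:performsStyle ?style .\n"
             + "        ?style rdfs:label \"" + escapeLiteral(style) + "\" . }";
    }

    public static void main(String[] args) {
        System.out.println(artistsByStyle("Rock"));
    }
}
```

The resulting string can be passed to the query factory in the same way as the hand-built one.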
MEMO: Execution Semantics (2) [follows] 3. For each artist, meex: (a) uses the SPARQL client to query the MusicBrainz SPARQL endpoint and obtains the artist's name and his/her relationships with other artists; (b) invokes the EVDB REST service, obtains in XML the events that refer to the artist, and uses the GRDDL processor to obtain these data in RDF; (c) links the data about the artist to the data about the events that refer to him/her. [more to follow]
Step 3.a: querying MusicBrainz Step 3.a of the meex execution semantics requires querying MusicBrainz for the data that describe an artist, including the related artists.
String sparqlQueryString =
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
  + "PREFIX mb: <http://musicbrainz.org/>\n"
  + "DESCRIBE <" + artist + ">";
SPARQLClient sparqlClient = new SPARQLClient(null);
try {
  return sparqlClient.executeDescribeQuery(sparqlQueryString,
    Config.MusicBrainzSPARQLEndpoint);
} finally {
  sparqlClient.closeQuery();
}
Excerpts from the file MusicBrainz.java
Step 3.b: querying EVDB Step 3.b of the meex execution semantics requires invoking the EVDB REST service, obtaining the list of events in XML and using the GRDDL processor to obtain the RDF.
invokeHttpEndpoint(performerLabel, eventsFilename);
prepareForGRDDL(eventsFilename);
Model m = GRDDLProcessor.ApplyGRDDLTransformation(eventsFilename);

private static void invokeHttpEndpoint(String keywords, String outputFilename) throws IOException {
  URL url = new URL(
    "http://api.evdb.com/rest/events/atom?sort_order=relevance&"
    + "keywords=" + URLEncoder.encode(keywords, "UTF-8")
    + "&category=music&app_key=" + Config.EVDBKey);
  URLConnection conn = url.openConnection();
  conn.setDoOutput(true);
  BufferedReader in = new BufferedReader(new InputStreamReader(
    conn.getInputStream()));
  […]
  while ((inLine = in.readLine()) != null)
    writer.write(inLine + "\n");
}
Excerpts from the file EVDB.java
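The two reusable pieces of this step, building the request URL and copying the response line by line, can be isolated and tried without network access. The following is a minimal sketch under that assumption; the class `EvdbUrlSketch` and its methods are hypothetical, the application key is passed in as a parameter instead of being read from Config, and the URL layout is the one shown in the excerpt above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Builds the EVDB request URL and copies a response stream line by line. */
final class EvdbUrlSketch {

    /** Returns the REST URL for music events matching the keywords (URL layout as in EVDB.java). */
    static String eventsUrl(String keywords, String appKey) {
        return "http://api.evdb.com/rest/events/atom?sort_order=relevance"
             + "&keywords=" + URLEncoder.encode(keywords, StandardCharsets.UTF_8)
             + "&category=music"
             + "&app_key=" + URLEncoder.encode(appKey, StandardCharsets.UTF_8);
    }

    /** Copies every line from the reader to the writer, terminating each with a newline. */
    static void copyLines(BufferedReader in, Writer out) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            out.write(line);
            out.write('\n');
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(eventsUrl("The Beatles", "YOUR_APP_KEY")); // placeholder key
        // try-with-resources closes the reader and the writer even if copying fails
        try (BufferedReader in = new BufferedReader(new StringReader("<feed>\n</feed>"));
             StringWriter out = new StringWriter()) {
            copyLines(in, out);
            System.out.print(out);
        }
    }
}
```

In the real method the reader would wrap the connection's input stream and the writer would point at the events file; wrapping both in try-with-resources avoids leaking them when the service fails midway.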
Step 3.c: linking artists to events Step 3.c of the meex execution semantics requires linking the artist information retrieved from MusicMoz and MusicBrainz to the event information retrieved from EVDB. We can use the following SPARQL CONSTRUCT query to create the links:
String sparqlQueryString =
    "PREFIX meex: <http://swa.cefriel.it/meex#>\n"
  + "CONSTRUCT {<" + performer + "> meex:performsEvent ?event.}\n"
  + "WHERE {?event a meex:Event.}";
MEMO: Execution Semantics (3) [follows] 4. When all the pieces of information about artists and events are available in the RDF storage, meex extracts them and serializes them in the format of the Ajax Web framework. 5. The Ajax Web framework lets the user explore the events found by meex. 6. When the user decides to start a new exploration, meex starts over from the beginning.
Step 4: preparing the data for the GUI We choose Exhibit as Ajax Web framework because it: allows faceted browsing; allows grouping and filtering events by artist name, artist nationality, the style the artist plays and the related artists; and includes different views, namely an ordered list, a chronological graph and a geographic map.
Step 4: configuring Exhibit We can configure Exhibit by means of two files: an HTML page that controls the look and feel, and a JSON file that contains the data to be explored by the user. In this tutorial we focus on the preparation of the JSON file; for the preparation of the Exhibit HTML page for meex we refer to the Exhibit documentation and to the website of our Semantic Web book. A JSON file is a simple text file that contains data organized in a set of records. In the following slide we show the information about The Beatles expressed in JSON.
Step 4: a sample JSON file
type: "Event",
label: "1964 The Tribute Tribute to Beatles",
eventful_link: "http://eventful.com/events/E0-001-006129372-5",
when_startTime: "2008-01-25",
when_endTime: "2008-01-26",
where_label: "Paramount Theater",
where_address: "17 South Street, New York 10940, United States",
where_latlng: "41.4544,-74.471",
performer_label: "The Beatles",
fromCountry: "England",
styles: ["Skiffle", "British Invasion", "Rock"],
relatedPerformers: ["The Beach Boys", "Eric Clapton"]
Step 4: serializing RDF in JSON To serialize RDF in JSON, we first extract the information we loaded in the RDF storage using the SPARQL query shown in the following slide, and then serialize the result in JSON. NOTE: as we have already said several times, the query can be expressed in terms of the application ontology even if the data were loaded in other, heterogeneous formats.
Step 4: extracting the data
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX meex: <http://swa.cefriel.it/meex#>
PREFIX gd: <http://schemas.google.com/g/2005>
SELECT DISTINCT ?event ?event_label ?when_startTime ?when_endTime
  ?where_label ?where_address ?where_lat ?where_lon
  ?performer ?performer_label ?fromCountry
WHERE {
  ?event rdfs:label ?event_label;
    meex:hasWhen ?when;
    meex:hasWhere ?where.
  ?when gd:startTime ?when_startTime;
    gd:endTime ?when_endTime.
  ?where gd:label ?where_label;
    gd:postalAddress ?where_address;
    gd:hasGeoPt ?geoPt.
  ?geoPt gd:lat ?where_lat;
    gd:lon ?where_lon.
  ?performer meex:performsEvent ?event;
    rdfs:label ?performer_label;
    meex:fromCountry ?fromCountry.
}
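Each solution of this query maps naturally onto one record of the JSON file. The sketch below turns a single result row, represented here as a plain map from variable name to value, into a JSON object with a subset of the keys of the sample record, joining latitude and longitude into where_latlng. The class `ExhibitJsonSketch` and the map-based row are assumptions made for illustration; how the records are wrapped inside the Exhibit data file is left to the Exhibit documentation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Serializes one query solution as a flat JSON object in the shape of the sample record. */
final class ExhibitJsonSketch {

    /** Returns the string as a JSON string literal, escaping quotes, backslashes and controls. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Maps the query variables of one row onto the keys used in the sample JSON record. */
    static String toItem(Map<String, String> row) {
        Map<String, String> item = new LinkedHashMap<>(); // keeps the key order stable
        item.put("type", "Event");
        item.put("label", row.get("event_label"));
        item.put("when_startTime", row.get("when_startTime"));
        item.put("where_latlng", row.get("where_lat") + "," + row.get("where_lon"));
        item.put("performer_label", row.get("performer_label"));
        StringJoiner json = new StringJoiner(", ", "{", "}");
        item.forEach((key, value) -> json.add(quote(key) + ": " + quote(value)));
        return json.toString();
    }

    public static void main(String[] args) {
        System.out.println(toItem(Map.of(
            "event_label", "1964 The Tribute Tribute to Beatles",
            "when_startTime", "2008-01-25",
            "where_lat", "41.4544",
            "where_lon", "-74.471",
            "performer_label", "The Beatles")));
    }
}
```

Multi-valued fields such as styles and relatedPerformers do not appear among the SELECT variables above, so they would need a separate query and an array-valued field; the sketch leaves them out.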
Step 5 and 6
Tools employed (1)
Jena: application framework, http://jena.sourceforge.net
Derby: relational database for the RDF storage, http://db.apache.org/derby
PostgreSQL: relational database for MusicBrainz, http://www.postgresql.org
D2RQ: translator from relational database to RDF, http://sites.wiwiss.fu-berlin.de/suhl/bizer/d2rq
Tools employed (2)
Joseki: SPARQL endpoint server, http://www.joseki.org
ARQ: SPARQL query engine for Jena, http://jena.sourceforge.net/ARQ
GRDDL Reader: GRDDL processor, http://jena.sourceforge.net/grddl
Exhibit: Ajax Web framework, http://static.simile.mit.edu/exhibit
Thank you for your attention. Any questions?
Realizing a Semantic Web Application Emanuele Della Valle, Dario Cerizza, Irene Celino http://www.cefriel.it http://swa.cefriel.it emanuele.dellavalle@cefriel.it http://emanueledellavalle.org 7th Int. Semantic Web Conference, ISWC 2008, Karlsruhe, Germany, October 26, 2008 Center of Excellence For Research, Innovation, Education and industrial Lab partnership - Politecnico di Milano
Credits and Links CREDITS: Dario Cerizza [dario.cerizza@cefriel.it], who helped in conceiving meex and designed and developed it; Irene Celino [irene.celino@cefriel.it], who helped in conceiving meex and supported its design and development; all the people involved in CEFRIEL’s Semantic Web Activities. LINKS: Visit http://swa.cefriel.it; try http://swa.cefriel.it/Squiggle; try http://swa.cefriel.it/SOIP-F. ACKNOWLEDGEMENT: The research behind this tutorial is partially funded by the Italian research project NeP4B.
Advertisement: if you speak Italian …

Emanuele Della Valle
 
PDF
Stream reasoning: mastering the velocity and the variety dimensions of Big Da...
Emanuele Della Valle
 
PPTX
On Stream Reasoning
Emanuele Della Valle
 
PPTX
Listening to the pulse of our cities fusing Social Media Streams and Call Dat...
Emanuele Della Valle
 
PPTX
Social listener-brera-design-district-2015-03
Emanuele Della Valle
 
PDF
City Data Fusion for Event Management (in Italiano)
Emanuele Della Valle
 
Taming velocity - a tale of four streams
Emanuele Della Valle
 
Stream reasoning
Emanuele Della Valle
 
Work in progress on Inductive Stream Reasoning
Emanuele Della Valle
 
Big Data and Data Science W's
Emanuele Della Valle
 
Knowledge graphs in search engines
Emanuele Della Valle
 
La città dei balocchi 2017 in numeri - Fluxedo
Emanuele Della Valle
 
Stream Reasoning: a summary of ten years of research and a vision for the nex...
Emanuele Della Valle
 
ACQUA: Approximate Continuous Query Answering over Streams and Dynamic Linked...
Emanuele Della Valle
 
Stream reasoning: an approach to tame the velocity and variety dimensions of ...
Emanuele Della Valle
 
Big Data: how to use it to create value
Emanuele Della Valle
 
Listening to the pulse of our cities with Stream Reasoning (and few more tech...
Emanuele Della Valle
 
Ist16-04 An introduction to RDF
Emanuele Della Valle
 
Ist16-03 An Introduction to the Semantic Web
Emanuele Della Valle
 
Ist16-02 HL7 from v2 (syntax) to v3 (semantics)
Emanuele Della Valle
 
IST16-01 - Introduction to Interoperability and Semantic Technologies
Emanuele Della Valle
 
Stream reasoning: mastering the velocity and the variety dimensions of Big Da...
Emanuele Della Valle
 
On Stream Reasoning
Emanuele Della Valle
 
Listening to the pulse of our cities fusing Social Media Streams and Call Dat...
Emanuele Della Valle
 
Social listener-brera-design-district-2015-03
Emanuele Della Valle
 
City Data Fusion for Event Management (in Italiano)
Emanuele Della Valle
 
Ad

Recently uploaded (20)

PDF
Empower Inclusion Through Accessible Java Applications
Ana-Maria Mihalceanu
 
PDF
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 
PDF
Presentation - Vibe Coding The Future of Tech
yanuarsinggih1
 
PDF
Timothy Rottach - Ramp up on AI Use Cases, from Vector Search to AI Agents wi...
AWS Chicago
 
PDF
Newgen 2022-Forrester Newgen TEI_13 05 2022-The-Total-Economic-Impact-Newgen-...
darshakparmar
 
PDF
Transcript: New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
PPTX
COMPARISON OF RASTER ANALYSIS TOOLS OF QGIS AND ARCGIS
Sharanya Sarkar
 
PPTX
"Autonomy of LLM Agents: Current State and Future Prospects", Oles` Petriv
Fwdays
 
PDF
New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
PDF
HubSpot Main Hub: A Unified Growth Platform
Jaswinder Singh
 
PDF
LLMs.txt: Easily Control How AI Crawls Your Site
Keploy
 
PDF
NewMind AI - Journal 100 Insights After The 100th Issue
NewMind AI
 
PPTX
UiPath Academic Alliance Educator Panels: Session 2 - Business Analyst Content
DianaGray10
 
PDF
Achieving Consistent and Reliable AI Code Generation - Medusa AI
medusaaico
 
PDF
Chris Elwell Woburn, MA - Passionate About IT Innovation
Chris Elwell Woburn, MA
 
PDF
Smart Trailers 2025 Update with History and Overview
Paul Menig
 
PDF
Log-Based Anomaly Detection: Enhancing System Reliability with Machine Learning
Mohammed BEKKOUCHE
 
PDF
From Code to Challenge: Crafting Skill-Based Games That Engage and Reward
aiyshauae
 
PPTX
Building Search Using OpenSearch: Limitations and Workarounds
Sease
 
PPTX
AI Penetration Testing Essentials: A Cybersecurity Guide for 2025
defencerabbit Team
 
Empower Inclusion Through Accessible Java Applications
Ana-Maria Mihalceanu
 
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 
Presentation - Vibe Coding The Future of Tech
yanuarsinggih1
 
Timothy Rottach - Ramp up on AI Use Cases, from Vector Search to AI Agents wi...
AWS Chicago
 
Newgen 2022-Forrester Newgen TEI_13 05 2022-The-Total-Economic-Impact-Newgen-...
darshakparmar
 
Transcript: New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
COMPARISON OF RASTER ANALYSIS TOOLS OF QGIS AND ARCGIS
Sharanya Sarkar
 
"Autonomy of LLM Agents: Current State and Future Prospects", Oles` Petriv
Fwdays
 
New from BookNet Canada for 2025: BNC BiblioShare - Tech Forum 2025
BookNet Canada
 
HubSpot Main Hub: A Unified Growth Platform
Jaswinder Singh
 
LLMs.txt: Easily Control How AI Crawls Your Site
Keploy
 
NewMind AI - Journal 100 Insights After The 100th Issue
NewMind AI
 
UiPath Academic Alliance Educator Panels: Session 2 - Business Analyst Content
DianaGray10
 
Achieving Consistent and Reliable AI Code Generation - Medusa AI
medusaaico
 
Chris Elwell Woburn, MA - Passionate About IT Innovation
Chris Elwell Woburn, MA
 
Smart Trailers 2025 Update with History and Overview
Paul Menig
 
Log-Based Anomaly Detection: Enhancing System Reliability with Machine Learning
Mohammed BEKKOUCHE
 
From Code to Challenge: Crafting Skill-Based Games That Engage and Reward
aiyshauae
 
Building Search Using OpenSearch: Limitations and Workarounds
Sease
 
AI Penetration Testing Essentials: A Cybersecurity Guide for 2025
defencerabbit Team
 

Developing A Semantic Web Application - ISWC 2008 tutorial

  • 1. Realizing a Semantic Web Application. Emanuele Della Valle, Dario Cerizza, Irene Celino. http://www.cefriel.it http://swa.cefriel.it [email_address] http://emanueledellavalle.org 7th Int. Semantic Web Conference (ISWC 2008), Karlsruhe, Germany, October 26, 2008. Center of Excellence For Research, Innovation, Education and industrial Lab partnership - Politecnico di Milano
  • 2. Goal We will “develop” together a Semantic Web application that we named Music Event Explorer, or simply meex. We will challenge the Semantic Web technologies by realizing a new service for Web users that uses, transforms and combines existing data
  • 3. Ingredients RDF as unified data model; OWL as modelling language for the data sources; GRDDL as a standard approach to translate into RDF the data stored in XML data sources; D2RQ as tool to translate into RDF the data stored in relational data sources; SPARQL as standard query language to access RDF data; Jena as application framework to merge the various data in a single RDF model and manipulate it; Joseki as tool to expose a SPARQL endpoint; ARQ as SPARQL client library; an RDF store to guarantee persistency; an OWL reasoner to infer new knowledge; Exhibit as user interface
  • 4. Approach In order to realize meex: we start from the user need; we derive user requirements; we develop the ontologies and the software components. While presenting, we will explain the use of Semantic Web technologies and tools (green background slides). A demonstrative installation of the application, together with the source code, is available at http://swa.cefriel.it/meex
  • 5. Towards a Semantic Web Ivan Herman in introducing the Semantic Web* explains: the current Web represents information using natural language (English, Hungarian, Chinese,…), graphics, multimedia, page layout. Humans can process this easily: they can deduce facts from partial information, can create mental associations, are used to various sensory information (well, sort of… people with disabilities may have serious problems on the Web with rich media!) * http://www.w3.org/People/Ivan/CorePresentations/IntroThroughExample/
  • 6. Towards a Semantic Web Tasks often require combining data on the Web: hotel and travel information may come from different sites, searches in different digital libraries, etc. Again, humans combine this information easily, even if different terminologies are used!
  • 7. Towards a Semantic Web However: machines are ignorant! Partial information is unusable; it is difficult to make sense of, e.g., an image; drawing analogies automatically is difficult; it is difficult to combine information automatically: is <foo:creator> the same as <bar:author>? How to combine different XML hierarchies? …
  • 8. A user need for meex Imagine a user who needs to explore music events related to a given music style. An event is a concert, a show or a workshop in which one or more artists participate. An artist is either a single musician or a band. For instance, if a user is interested in Folk music, meex finds the artists that play Folk music, searches for events of those artists, and allows the user to explore the events related to each artist as a list, on a time line and on a map
  • 9. A manual solution I open musicmoz [1] and I look up artists that play Folk music. If the pages of the artists on musicmoz don’t satisfy me, I navigate to musicbrainz [2]. I look up in EVDB [3] whether some of those artists have organized an event close to my location in these days. I take note of the possible alternatives and I check how to get there using Google Maps [4]. [1] http://www.musicmoz.org [2] http://www.musicbrainz.org [3] http://www.eventful.com [4] http://maps.google.com
  • 10. A manual solution I look up artists that play Folk music
  • 11. A manual solution I can learn more navigating to musicbrainz
  • 12. A manual solution I look up in EVDB if some of those artists have organized an event close to my location in these days
  • 13. A manual solution I take note of the possible alternatives and I check how to get there using google maps
  • 14. Music Event Explorer Of course I can do it manually, but I need the time to do so. Can’t I write a mash-up?
  • 15. What is needed? Ivan Herman in introducing the Semantic Web* explains: (Some) data should be available for machines for further processing. Data should be possibly combined, merged on a Web scale. Sometimes, data may describe other data (like the library example, using metadata)… … but sometimes the data is to be exchanged by itself, like my calendar or my travel preferences. Machines may also need to reason about that data * http://www.w3.org/People/Ivan/CorePresentations/IntroThroughExample/
  • 16. The rough structure of data integration Map the various data onto an abstract data representation make the data independent of its internal representation… Merge the resulting representations Start making queries on the whole! queries that could not have been done on the individual data sets
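The three steps above (map, merge, query the whole) can be sketched in a few lines. This is only an illustration, not the Jena-based code of the tutorial: triples are modelled as plain Python tuples, and the identifiers `mb:beatles`, `evdb:e1` and the predicate `ex:performedBy` are hypothetical sample values.

```python
# Each source is mapped onto the same abstract representation:
# a set of (subject, predicate, object) triples. Sample data is hypothetical.
musicmoz = {
    ("mb:beatles", "mm:hasStyle", "mm:Rock"),
    ("mb:beach_boys", "mm:hasStyle", "mm:Rock"),
}
evdb = {
    ("evdb:e1", "ex:performedBy", "mb:beatles"),  # ex:performedBy is an assumed predicate
    ("evdb:e1", "gd:startTime", "2008-02-28"),
}


def merge(*graphs):
    """Merging the abstract representations is a plain set union of triples."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def events_for_style(graph, style):
    """Find events performed by any artist that plays the given style."""
    artists = {s for (s, p, o) in graph if p == "mm:hasStyle" and o == style}
    return sorted(s for (s, p, o) in graph
                  if p == "ex:performedBy" and o in artists)


merged = merge(musicmoz, evdb)
print(events_for_style(merged, "mm:Rock"))
```

The query needs triples from both sources, so it finds nothing when run on either data set alone; it only succeeds on the merged graph.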
  • 17. The rough structure of data integration
  • 18. So where is the Semantic Web? The Semantic Web provides technologies to make such integration possible! For example: an abstract model for the relational graphs: RDF extract RDF information from XML (eg, XHTML) pages: GRDDL add structured information to XHTML pages: RDFa a query language adapted for the relational graphs: SPARQL characterize the relationships, categorize resources: RDFS , OWL , SKOS , Rules applications may choose among the different technologies some of them may be relatively simple with simple tools (RDFS), whereas some require sophisticated systems (OWL, Rules) reuse of existing “ontologies” that others have produced
  • 19. So where is the Semantic Web?
  • 20. A Semantic Web application is still an application! A Semantic Web application is still an application, thus we need to follow good practice from Software Engineering in developing it. We adopt a Spiral Model inspired by the famous Boehm spiral model We extend it with Knowledge Engineering practices
  • 21. [Figure: spiral development process] Requirements: R.1 Users’ needs analysis, R.2 Risk analysis, R.3 Software requirements analysis, R.4 Content requirements analysis. Design: D.1 Model the application ontology, D.2 Model the content ontology, D.3 Model sample contents, D.4 Design Application. Implementation: I.1 Implement the initial Knowledge Base, I.2 Implement the integrated model, I.3 Choose content annotation methods, I.4 Implement the application. V.1 Validation. T.1 Testing. Two steps carry the annotation “Reuse, Merge, Extend”.
  • 22. [Figure: the development process diagram of slide 21, repeated]
  • 23. [Figure: the development process diagram of slide 21, repeated]
  • 24. Content requirements analysis Given that we are developing a Semantic Web application, it is crucial that we reuse data already available on the Web: EVDB - http://eventful.com MusicBrainz - http://musicbrainz.org MusicMoz - http://musicmoz.org
  • 25. EVDB EVDB is a Web 2.0 website that makes available information about events all around the world. For each event it knows: the start date, the end date, the place in terms of address and geographic coordinates. EVDB offers a Web API in the form of a REST service, see http://api.evdb.com
  • 26. MusicBrainz MusicBrainz is a Web 2.0 website that has gathered a large amount of information about music. It offers information about: artists and bands; songs, albums and tracks; relations among artists and bands. The data of MusicBrainz are available as a PostgreSQL dump, see http://musicbrainz.org/doc/DatabaseDownload
  • 27. MusicMoz MusicMoz is another Web 2.0 website dedicated to music. It offers information about: artists and bands, including their nationality; music styles and their taxonomic relationships; the styles each artist or band plays. It reuses MusicBrainz identifiers for artists and bands. The data of MusicMoz are available as large XML files, see http://musicmoz.org/xml/
  • 28. meex needs to merge this data To manipulate all this data at the same time, meex needs to merge the data of the three data sources. The artist and band information from MusicBrainz should be linked to: the music styles they play, from MusicMoz; the events related to them, from EVDB
  • 29. Data Licences The data of all three data sources are freely usable; we just need to make sure that the logos of the three applications appear on each page of meex. EVDB also requests that we include a link to the permalink of the event on the EVDB website. MusicBrainz also requests that derived data are made available under a Creative Commons licence. Read more here: EVDB - http://api.eventful.com/terms MusicMoz - http://musicmoz.org/xml/license.html MusicBrainz - http://musicbrainz.org/doc/DatabaseDownload
  • 30. Application requirements analysis (1) In this step (namely R.3) we should elicit: functional requirements of the application, such as grouping and filtering data; non-functional requirements of the application, such as performance and scalability w.r.t. the number of users. However, this is just a tutorial, therefore we concentrate on functional requirements, leaving non-functional requirements underspecified
  • 31. Application requirements analysis (2) Meex must enable a user to explore data in the form of a list a chronological graphic a geographic map for each event must show name begin and end date place for each artist must show name nationality music styles he/she plays related artists must allow users to filter and rank results
  • 32. [Figure: the development process diagram of slide 21, repeated]
  • 33. Motivations for RDF Eric Prud'hommeaux explains*: Simple, consistent data model. Uses web architecture for web scalability. Glamorous use cases. * http://www.w3.org/2008/Talks/1027-ISWC/HCLS
  • 34. What does RDF provide? Common (simple) model for all data. Incentive and infrastructure to re-use terms when possible and invent terms when necessary. Simple and complex ontological languages (RDFS and OWL). Intuitive re-use of now-familiar web topology. Scalable: partial (monotonic) reasoning allowed. Apps need not be re-written for each extension to a data model.
  • 35. How do we write RDF? Name resources and relationships with URIs, e.g. http://emanueledellavalle.org/sw/foaf.rdf#me represents a person. Express statements as subject, predicate, object. Write the triples in: RDF/XML, the standard serialization in XML: <rdf:Description rdf:about="subject"> <property>value</property> </rdf:Description> N-Triples, a simple (verbose) reference serialization (for specifications only): <http:// ... subject> <http:// ... predicate> "value" . N3 and Turtle, developer-friendly serializations: :subject :property "value" .
  • 36. Turtle Introduction RDF triples are analogous to one 3-place holds(s, p, o) predicate: edv:me foaf:knows ic:me . corresponds to holds(edv:me, foaf:knows, ic:me). Triples are made from standard RDF terms: IRIs: <http://emanueledellavalle.org/sw/foaf.rdf#me> or <#me> or edv:me Literals: "Emanuele Della Valle" Typed literals: "3.14"^^xsd:float Literals with language tags: "日本語"@ja Blank nodes: [] or _:bob
  • 37. Convenience Syntax URI terms can be abbreviated using namespaces: @prefix edv: <http://emanueledellavalle.org/sw/foaf.rdf#> . @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . edv:me rdf:type foaf:Person . 'a' = <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>: edv:me a foaf:Person . In-line blank nodes: edv:me foaf:knows [ foaf:name "Irene Celino" ] .
  • 38. Convenience Syntax Abbreviating repeated subjects: edv:me rdf:type foaf:Person . edv:me foaf:knows ic:me . ... is the same as ... edv:me rdf:type foaf:Person ; foaf:knows ic:me . Abbreviating repeated subject/predicate pairs: edv:me foaf:knows ic:me . edv:me foaf:knows dc:me . ... is the same as ... edv:me foaf:knows ic:me , dc:me .
  • 39. Artist data in RDF Original XML data as in MusicMoz: <category name="Bands_and_Artists/B/Beatles,_The" type="band"> <resource name="musicbrainz" link="http://musicbrainz.org/artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d.html"/> <style number="1">British Invasion</style> <style number="2">Rock</style> <style number="3">Skiffle</style> </category> The same data represented in RDF in meex: mb:artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d.html a mb:Band ; rdfs:label "The Beatles" ; mm:hasStyle mm:BritishInvasion, mm:Rock, mm:Skiffle .
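To make the XML-to-RDF step concrete, here is a minimal sketch that turns the MusicMoz fragment above into triples. It uses Python's standard XML parser rather than the GRDDL/XSLT route the tutorial lists among its ingredients. The style URI pattern (`http://musicmoz.org/style/...`) follows the sample data shown later in the deck, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The MusicMoz fragment from the slide above.
SAMPLE = """<category name="Bands_and_Artists/B/Beatles,_The" type="band">
  <resource name="musicbrainz" link="http://musicbrainz.org/artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d.html"/>
  <style number="1">British Invasion</style>
  <style number="2">Rock</style>
  <style number="3">Skiffle</style>
</category>"""

MM = "http://musicmoz.org/"  # namespace bound to the mm: prefix in the deck


def category_to_triples(xml_text):
    """Map one <category> element to (subject, predicate, object) URI triples."""
    category = ET.fromstring(xml_text)
    # The MusicBrainz link doubles as the artist URI, so the two sources join on it.
    subject = next(r.get("link") for r in category.findall("resource")
                   if r.get("name") == "musicbrainz")
    triples = []
    for style in category.findall("style"):
        # Assumed URI scheme: spaces in the style name become hyphens.
        style_uri = MM + "style/" + style.text.strip().replace(" ", "-")
        triples.append((subject, MM + "hasStyle", style_uri))
    return triples


def to_ntriples(triples):
    """Serialize URI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = category_to_triples(SAMPLE)
print(to_ntriples(triples))
```

Using the MusicBrainz link as subject is what later lets the MusicMoz styles attach to the same artist resource that MusicBrainz describes.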
  • 40. RDF Resources RDF at the W3C - primer and specifications Semantic Web tools - community maintained list; includes triple store, programming environments, tool sets, and more 302 Semantic Web Videos and Podcasts - includes a section specifically on RDF videos
  • 41. RDFS/OWL in a nutshell: class and instance Creating a class RDFS: Artist rdf:type rdfs:Class . FOL: ∃x Artist(x) Creating a subclass RDFS: Painter rdfs:subClassOf Artist . RDFS: Sculptor rdfs:subClassOf Artist . FOL: ∀x [Painter(x) ∨ Sculptor(x) → Artist(x)] Creating an instance RDFS: Rodin rdf:type Sculptor . FOL: Sculptor(Rodin) [Figure labels: Artist, Painter, Sculptor, Rodin]
  • 42. RDFS/OWL in a nutshell: properties Creating a property RDFS: creates rdf:type rdf:Property . FOL: ∃x ∃y Creates(x,y) Using a property RDFS: Rodin creates TheKiss . FOL: Creates(Rodin, TheKiss) Creating subproperties RDFS: paints rdfs:subPropertyOf creates . FOL: ∀x ∀y [Paints(x,y) → Creates(x,y)] RDFS: sculpts rdfs:subPropertyOf creates . FOL: ∀x ∀y [Sculpts(x,y) → Creates(x,y)] [Figure labels: creates, paints]
  • 43. RDFS/OWL in a nutshell: range & domain Checking which classes and predicates can be used together RDFS: creates rdfs:domain Artist . creates rdfs:range Piece . paints rdfs:domain Painter . paints rdfs:range Paint . sculpts rdfs:domain Sculptor . sculpts rdfs:range Sculpt . FOL: ∀x ∀y [Creates(x,y) → Artist(x) ∧ Piece(y)] ∀x ∀y [Paints(x,y) → Painter(x) ∧ Paint(y)] ∀x ∀y [Sculpts(x,y) → Sculptor(x) ∧ Sculpt(y)]
  • 45. Some Inference Rules
      if: x rdfs:subClassOf y . a rdf:type x .  then: a rdf:type y .
      if: x rdfs:subClassOf y . y rdfs:subClassOf z .  then: x rdfs:subClassOf z .
      if: x a y . a rdfs:subPropertyOf b .  then: x b y .
      if: a rdfs:subPropertyOf b . b rdfs:subPropertyOf c .  then: a rdfs:subPropertyOf c .
      if: x a y . a rdfs:domain z .  then: x rdf:type z .
      if: x a u . a rdfs:range z .  then: u rdf:type z .
  • 46. Inference at work Sharing the ontology ... Sculptor rdfs:subClassOf Artist . Painter rdfs:subClassOf Artist . Painter owl:disjointWith Sculptor . Sculpt rdfs:subClassOf Piece . Painting rdfs:subClassOf Piece . Painting owl:disjointWith Sculpt . creates rdfs:domain Artist . creates rdfs:range Piece . sculpts rdfs:subPropertyOf creates . sculpts rdfs:domain Sculptor . sculpts rdfs:range Sculpt . ... when transmitting the following triple … Rodin sculpts TheKiss . … the recipient can …
  • 47. Inference at work … the recipient can answer the syntactic query Sculpts(Rodin,TheKiss)? yes but it can also answer queries such as Sculptor(Rodin)? yes Artist(Rodin)? yes Painter(Rodin)? no Sculpt(TheKiss)? yes Piece(TheKiss)? yes Painting(TheKiss)? no Creates(Rodin,TheKiss)? yes Painting(Rodin,TheKiss)? no NOTE : The recipient cannot give such answers without sharing the ontology (i.e., if the triple was an XML fragment valid w.r.t. a shared DTD)
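The positive answers above can be reproduced with a naive forward-chaining loop over the six RDFS rules of slide 45. This is a sketch under simplifying assumptions, not how an OWL reasoner is implemented: terms are plain strings, and only the RDFS part of the ontology is used. The "no" answers on the slide rely on the owl:disjointWith axioms, which this sketch does not handle, so a missing fact here only means "not derived".

```python
TYPE, SUBCLASS, SUBPROP = "rdf:type", "rdfs:subClassOf", "rdfs:subPropertyOf"
DOMAIN, RANGE = "rdfs:domain", "rdfs:range"

# RDFS part of the shared ontology from slide 46 (disjointness axioms omitted).
ontology = {
    ("Sculptor", SUBCLASS, "Artist"), ("Painter", SUBCLASS, "Artist"),
    ("Sculpt", SUBCLASS, "Piece"), ("Painting", SUBCLASS, "Piece"),
    ("creates", DOMAIN, "Artist"), ("creates", RANGE, "Piece"),
    ("sculpts", SUBPROP, "creates"),
    ("sculpts", DOMAIN, "Sculptor"), ("sculpts", RANGE, "Sculpt"),
}


def rdfs_closure(triples):
    """Apply the six rules repeatedly until no new triple can be derived."""
    facts = set(triples)
    while True:
        new = set()
        for (s, p, o) in facts:
            for (s2, p2, o2) in facts:
                if p == TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, TYPE, o2))        # instance of subclass
                if p == SUBCLASS and p2 == SUBCLASS and o == s2:
                    new.add((s, SUBCLASS, o2))    # subclass transitivity
                if p2 == SUBPROP and p == s2:
                    new.add((s, o2, o))           # subproperty inheritance
                if p == SUBPROP and p2 == SUBPROP and o == s2:
                    new.add((s, SUBPROP, o2))     # subproperty transitivity
                if p2 == DOMAIN and p == s2:
                    new.add((s, TYPE, o2))        # domain typing of the subject
                if p2 == RANGE and p == s2:
                    new.add((o, TYPE, o2))        # range typing of the object
        if new <= facts:
            return facts
        facts |= new


inferred = rdfs_closure(ontology | {("Rodin", "sculpts", "TheKiss")})
for fact in sorted(t for t in inferred if t[0] in ("Rodin", "TheKiss")):
    print(fact)
```

From the single transmitted triple the closure derives that Rodin is a Sculptor and an Artist, that TheKiss is a Sculpt and a Piece, and that Rodin creates TheKiss.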
  • 48. Model the Application Ontology As the first design step (namely D.1) we model the application ontology. meex must manage information related to: artists; events at which the artists participate; and music styles the artists play
  • 49. Modeling Performer in OWL @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> . @prefix owl: <http://www.w3.org/2002/07/owl#> . @prefix meex: <http://swa.cefriel.it/meex#> . meex:Performer a owl:Class ; rdfs:label "Performer" . meex:fromCountry a owl:DatatypeProperty ; rdfs:domain meex:Performer ; rdfs:range <http://www.w3.org/2001/XMLSchema#string> . meex:relatedPerformer a owl:ObjectProperty ; rdfs:domain meex:Performer ; rdfs:range meex:Performer . [more to follow] Meex.n3
  • 50. Modeling Style in OWL [follows] meex:Style a owl:Class ; rdfs:label "Music Style" . meex:performsStyle a owl:ObjectProperty ; rdfs:domain meex:Performer ; rdfs:range meex:Style . [more to follow] Meex.n3
  • 51. Modeling Event in OWL [follows] meex:Event a owl:Class ; rdfs:label "Event" . meex:performsEvent a owl:ObjectProperty ; rdfs:domain meex:Performer ; rdfs:range meex:Event . meex:hasWhen a owl:ObjectProperty ; rdfs:domain meex:Event ; rdfs:range gd:When . meex:hasWhere a owl:ObjectProperty ; rdfs:domain meex:Event ; rdfs:range gd:Where . Meex.n3 For each event we should model begin and end date together with the place; since an XML schema defined by Google already exists, we decide to reuse it by merging it
  • 52. Modeling When in OWL [namespace declaration] gd:When a owl:Class ; rdfs:label "Time" . gd:startTime a owl:DatatypeProperty ; rdfs:domain gd:When ; rdfs:range <http://www.w3.org/2001/XMLSchema#string> . gd:endTime a owl:DatatypeProperty ; rdfs:domain gd:When ; rdfs:range <http://www.w3.org/2001/XMLSchema#string> . [more to follow] GoogleSchema.n3
  • 53. Modeling Where in OWL gd:Where a owl:Class ; rdfs:label "Location" . gd:postalAddress a owl:DatatypeProperty ; rdfs:domain gd:Where ; rdfs:range <http://www.w3.org/2001/XMLSchema#string> . gd:hasGeoPt a owl:ObjectProperty ; rdfs:domain gd:Where ; rdfs:range gd:GeoPt . gd:GeoPt a owl:Class ; rdfs:label "Geo-referenced Point" . gd:lat a owl:DatatypeProperty ; rdfs:domain gd:GeoPt ; rdfs:range <http://www.w3.org/2001/XMLSchema#string> . gd:lon a owl:DatatypeProperty ; rdfs:domain gd:GeoPt ; rdfs:range <http://www.w3.org/2001/XMLSchema#string> . gd:label rdfs:subPropertyOf rdfs:label . GoogleSchema.n3
  • 54. Model the content ontology We keep following our approach and we model the content ontology (step D.2). The content ontology models in OWL the data of the three data sources used by meex. In the meantime we also model the sample contents (step D.3) that we will use to test meex during its implementation (see the test-first method from the Agile manifesto)
  • 55. [Figure: the development process diagram of slide 21, repeated]
  • 56. Modeling MusicBrainz schema in OWL @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> . @prefix owl: <http://www.w3.org/2002/07/owl#> . @prefix mb: <http://musicbrainz.org/> . mb:Artist a owl:Class ; rdfs:label "MusicBrainz Artist and Band" . mb:artist_relation a owl:ObjectProperty ; rdfs:domain mb:Artist ; rdfs:range mb:Artist . MusicBrainz.n3 [Figure: database schema; labels: artist, artist_relation, id, gid, artist, ref]
  • 57. Sample data for MusicBrainz in OWL mb:artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d.html a mb:Artist ; rdfs:label "The Beatles" ; mb:related_artist mb:artist/ebfc1398-8d96-47e3-82c3-f782abcdb13d.html , mb:artist/618b6900-0618-4f1e-b835-bccb17f84294.html . mb:artist/ebfc1398-8d96-47e3-82c3-f782abcdb13d.html a mb:Artist ; rdfs:label "The Beach Boys" . mb:artist/618b6900-0618-4f1e-b835-bccb17f84294.html a mb:Artist ; rdfs:label "Eric Clapton" . SampleInstance-MusicBrainz.n3 Please note that we choose to build the URI using the ID that MusicBrainz uses to identify the artists. This allows for easier reuse of meex data in other applications
  • 58. MusicMoz schema [Figure: XML schema diagram; labels: category, from, resource, style, name, link, name, string, type; cardinalities 1 and *]
  • 59. Modeling MusicMoz schema in OWL @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> . @prefix owl: <http://www.w3.org/2002/07/owl#> . @prefix mm: <http://musicmoz.org/> . @prefix mb: <http://musicbrainz.org/> . mm:from a owl:DatatypeProperty ; rdfs:domain mb:Artist ; rdfs:range <http://www.w3.org/2001/XMLSchema#string> . mm:Style a owl:Class ; rdfs:label "MusicMoz Music Style" . mm:hasStyle a owl:ObjectProperty ; rdfs:domain mb:Artist ; rdfs:range mm:Style . MusicMoz.n3
  • 60. Sample data for MusicMoz in OWL mb:artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d.html mm:from "England" ; mm:hasStyle mm:style/British-Invasion , mm:style/Rock , mm:style/Skiffle . mm:style/British-Invasion a mm:Style ; rdfs:label "British Invasion" . SampleInstance-MusicMoz.n3 Please note that also in this case we use the ID derived from MusicBrainz
  • 61. Modeling EVDB schema in OWL @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> . @prefix owl: <http://www.w3.org/2002/07/owl#> . @prefix evdb: <http://eventful.com/> . @prefix gd: <http://schemas.google.com/g/2005> . evdb:Event a owl:Class ; rdfs:label "Eventful Event" . evdb:hasWhen a owl:ObjectProperty ; rdfs:domain evdb:Event ; rdfs:range gd:When . evdb:hasWhere a owl:ObjectProperty ; rdfs:domain evdb:Event ; rdfs:range gd:Where . EVDB.n3 Please note that we reuse the concepts When and Where that we modeled in the application ontology by merging the Google schema (see GoogleSchema.n3).
  • 62. Sample data for EVDB in OWL evdb:events/E0-001-008121669-0@2008022719 a evdb:Event ; gd:label "Tell Me Why: A Beatles Commentary" ; evdb:hasWhen evdb:events/E0-001-008121669-0@2008022719_When ; evdb:hasWhere evdb:events/E0-001-008121669-0@2008022719_Where . evdb:events/E0-001-008121669-0@2008022719_When gd:startTime "2008-02-28" ; gd:endTime "2008-02-28" . evdb:events/E0-001-008121669-0@2008022719_Where gd:hasGeoPt evdb:events/E0-001-008121669-0@2008022719_GeoPt ; gd:label "The Wilmington Memorial Library" ; gd:postalAddress "175 Middlesex Avenue, Wilmington, USA" . evdb:events/E0-001-008121669-0@2008022719_GeoPt gd:lat "42.556943" ; gd:lon "-71.165576" . SampleInstance-EVDB.n3
  • 63. “ Application Connected by Concepts ” artists Music styles events time places Meex ontology MusicBrainz EVDB MusicMoz Meex
  • 64. Why SPARQL? Eric Prud'hommeaux explains* that SPARQL is the query language of the Semantic Web. It lets us: Pull values from structured and semi-structured data Explore data by querying unknown relationships Perform complex joins of disparate databases in a single, simple query Transform RDF data from one vocabulary to another * http://www.w3.org/2008/Talks/1027-ISWC/HCLS
  • 65. SELECTing variables SPARQL variables bind to RDF terms Ex. ?artist, ?album, ?times_platinum Like SQL, we pick the variables we want from a query with a SELECT clause Ex. SELECT ?artist ?album ?times_platinum A SELECT query results in a table of values: ?artist ?album ?times_platinum Michael Jackson Thriller 27 Led Zeppelin Led Zeppelin IV 22 Pink Floyd The Wall 22
  • 66. Triple patterns A triple pattern is an RDF triple that can have variables in any of the subject, predicate, or object positions. Examples: Find countries and their capital cities: ?country geo:capital ?capital . Given a FOAF URI, find the person's name: edv:me foaf:name ?name . What direct relationships exist between two people? edv:me ?relationship ic:me .
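To make the idea of a triple pattern concrete, the following is a minimal sketch in plain Java (no Jena involved) of how a single pattern can be matched against a single triple. The class name `TriplePatternSketch`, the representation of terms as plain strings and the sample name used in `main` are assumptions made for illustration only; a real SPARQL engine such as ARQ works on typed RDF nodes and optimizes the matching.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TriplePatternSketch {

    /**
     * Matches one triple pattern against one triple.
     * Terms starting with '?' are variables; every other term must be equal
     * to the term in the same position of the triple.
     * Returns the variable bindings, or null if the triple does not match.
     */
    static Map<String, String> match(String[] pattern, String[] triple) {
        Map<String, String> bindings = new LinkedHashMap<>();
        for (int i = 0; i < 3; i++) {
            if (pattern[i].startsWith("?")) {
                String previous = bindings.put(pattern[i], triple[i]);
                // A variable used twice in the pattern must bind to the same term.
                if (previous != null && !previous.equals(triple[i])) {
                    return null;
                }
            } else if (!pattern[i].equals(triple[i])) {
                return null;
            }
        }
        return bindings;
    }

    public static void main(String[] args) {
        // Hypothetical data: the literal value is made up for the example.
        String[] triple = {"edv:me", "foaf:name", "Emanuele"};
        String[] pattern = {"edv:me", "foaf:name", "?name"};
        System.out.println(match(pattern, triple));
    }
}
```

A query pattern made of several triple patterns is then answered by joining the bindings produced for each pattern on their shared variables.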
  • 67. Simple query pattern We can combine more than one triple pattern to retrieve multiple values and easily traverse an RDF graph: Find countries, their capital cities, and their populations: ?country geo:capital ?capital ; geo:population ?population . Given a FOAF URI, find the person's name and friends' names: edv:me foaf:name ?name ; foaf:knows ?friend . ?friend foaf:name ?friend_name . Retrieve all third-line managers in the company: ?emp hr:managedBy ?first_line . ?first_line hr:managedBy ?second_line . ?second_line hr:managedBy ?third_line .
  • 68. Result forms Besides selecting tables of values, SPARQL allows three other types of queries: ASK - returns a boolean answering, does the query have any results? CONSTRUCT - uses variable bindings to return new RDF triples DESCRIBE - returns server-determined RDF about the queried resources SELECT and ASK results can be returned as XML or JSON. CONSTRUCT and DESCRIBE results can be returned via any RDF serialization (e.g. RDF/XML or Turtle).
  • 69. Protocol Mechanics The SPARQL Protocol is a simple method for asking and answering SPARQL queries over HTTP. A SPARQL URL is built from three parts: The URL of a SPARQL endpoint, e.g. http://dbpedia.org/sparql (Optional, as part of the query string) The graphs to be queried against, e.g. default-graph-uri=http://dbpedia.org (As part of the query string) The query itself, e.g. query=SELECT distinct ?x WHERE {?x a <http://umbel.org/umbel/sc/Artist> } This is the resulting URL, with the parameter values percent-encoded: http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=SELECT+distinct+%3Fx+WHERE+%7B%3Fx+a+%3Chttp%3A%2F%2Fumbel.org%2Fumbel%2Fsc%2FArtist%3E+%7D
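The three parts of a SPARQL URL can be assembled with the Java standard library alone. The sketch below is only an illustration of the mechanics described in this slide: the class name `SparqlUrlBuilder` and its methods are hypothetical helpers, not part of Jena or ARQ, and only the GET form with the `default-graph-uri` and `query` parameters is covered.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class SparqlUrlBuilder {

    /** Percent-encodes a value for use inside a URL query string. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every JVM.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Builds a SPARQL Protocol GET URL.
     * defaultGraphUri may be null, because that parameter is optional.
     */
    static String build(String endpoint, String defaultGraphUri, String query) {
        StringBuilder url = new StringBuilder(endpoint).append('?');
        if (defaultGraphUri != null) {
            url.append("default-graph-uri=").append(encode(defaultGraphUri)).append('&');
        }
        url.append("query=").append(encode(query));
        return url.toString();
    }

    public static void main(String[] args) {
        System.out.println(build(
                "http://dbpedia.org/sparql",
                "http://dbpedia.org",
                "SELECT DISTINCT ?x WHERE { ?x a <http://umbel.org/umbel/sc/Artist> }"));
    }
}
```

Encoding each parameter value separately keeps characters such as `?`, `{`, `<` and `&` inside the query from being confused with the structure of the URL itself.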
  • 70. SPARQL Resources SPARQL Frequently Asked Questions SPARQL implementations - community maintained list of open-source and commercial SPARQL engines Public SPARQL endpoints - community maintained list SPARQL extensions - collection of SPARQL extensions implemented in various SPARQL engines
  • 71. D.1 Model the application ontology D.2 Model the content ontology R.1 Users’ needs analysis R.3 Software requirements analysis R.4 Content requirements analysis D.3 Model sample contents Reuse Merge Extend I.1 Implement the initial Knowledge Base V.1 Validation I.3 Choose content annotation methods I.2 Implement the integrated model Reuse Merge Extend I.4 Implement the application R.2 Risk analysis D.4 Design Application T.1 Testing
  • 72. Summary We are done with the modeling of ontologies and sample contents We can now design meex (step D.4 of our approach) In order to design meex architecture We first design its interfaces in terms of both graphic user interface and connection to the three data sources Secondly we design how it works inside in terms of components and execution semantics
  • 73. meex interfaces MusicBrainz database Adapter Database  RDF SPARQL Server EVDB REST service MusicMoz File XML meex XML Browser Web 3) HTML and RDF 2) RDF GRDDL processor EVDB  RDF MusicMoz  RDF XML 2) RDF 1 ) Music style User
  • 74. How we access the data In order to get RDF data out from the three external data source we can use different techniques For MusicBrainz database we can use tools that enable to query non-RDF databases as virtual RDF graphs using a standard SPARQL endpoint For MusicMoz XML files we can use a GRDDL processor using the XSLT MusicMoz->RDF For EVDB we can use a GRDDL processor applying the XSLT EVDB->RDF to the XML file obtained using the EVDB REST service
  • 75. User Interface In order to collect users’ input and to present results back to the users, we can use Web 2.0 technologies and develop an AJAX interface Such AJAX interface must allow for Inserting the music style, the resulting events will refer to Exploring the events found by meex Filtering the events based on Artists Their nationality The music style they play
  • 76. Designing how meex works inside Ajax Web Framework GRDDL Processor For each Artist SPARQL Client MusicBrainz SPARQL Endpoint HTTP REST Client EVDB HTTP REST service GRDDL Processor EVDB  RDF MusicMoz  RDF Linking Artists to Events RDF Merge Extraction and Transformation Ajax Web Framework Music style Set of artists in RDF Artist SPARQL Query Events in XML Events in RDF Artists and events in RDF Artist data in RDF HTTP Query RDF data Artists and events in RDF
  • 77. Execution Semantics (1) 1. The user requests a music style 2. meex accesses the local copy of MusicMoz and, using the GRDDL processor, obtains the set of artists that play the given music style [more to follow]
  • 78. Execution Semantics (2) [follows] 3. For each artist meex: a. uses the SPARQL client to query the MusicBrainz SPARQL endpoint and obtains the artist's name and his/her relationships with other artists b. invokes the EVDB REST service, obtains the events that refer to the artist in XML and uses the GRDDL processor to obtain this data in RDF c. links the data about each artist to the data about the events that refer to him/her [more to follow]
  • 79. Execution Semantics (3) [follows] 4. When all the pieces of information about artists and events are available in the RDF storage, meex extracts them and serializes them in the format of the Ajax Web framework 5. The Ajax Web framework lets the user explore the events found by meex 6. When the user decides to start a new exploration, meex starts over from the beginning
  • 80. Two important internal components The RDF storage must be initialized with both the application and the content ontologies; it is then filled with the data meex loads from the three data sources, given the music style requested by the user. The reasoner allows all queries in meex to be expressed in terms of the application ontology even if the data are loaded from the data sources using the content ontology. NOTE: the reasoner supports the semantic integration of the data loaded from the external data sources. The meex programmer can ignore that multiple, heterogeneous data sources were used to load the data
  • 81. D.1 Model the application ontology D.2 Model the content ontology R.1 Users’ needs analysis R.3 Software requirements analysis R.4 Content requirements analysis D.3 Model sample contents Reuse Merge Extend I.1 Implement the initial Knowledge Base V.1 Validation I.3 Choose content annotation methods I.2 Implement the integrated model Reuse Merge Extend I.4 Implement the application R.2 Risk analysis D.4 Design Application T.1 Testing
  • 82. Implement the initial Knowledge Base (1) We start implementing meex by setting up the initial knowledge base (step I.1) We need to select tools: to read and write RDF in the RDF/XML and RDF/N3 syntaxes; to manipulate RDF programmatically; to store RDF; to reason on OWL; to interpret SPARQL
  • 83. Implement the initial Knowledge Base (2) We choose Jena because it: offers an API to read and write different RDF syntaxes; provides a programmatic environment for RDF, RDFS, OWL and SPARQL; guarantees RDF model persistence through several relational database adapters; includes a rule-based inference engine that implements OWL semantics; includes ARQ, a query engine that supports SPARQL. In order to use the RDF storage and the OWL reasoner from Jena we need to configure them as shown in the following slides
  • 84. Configuring the RDF storage Class.forName("org.apache.derby.jdbc.EmbeddedDriver"); DBConnection con = new DBConnection("jdbc:derby:C:/Meex/RDFStorage;create=true", "sa", "", "Derby"); Model model = ModelFactory.createModelRDBMaker(con).createDefaultModel(); We choose Derby (from Apache) as the relational database underneath the RDF storage. With row 1 we tell Jena where to find the JDBC driver With row 2 we define the JDBC connection With row 3 we instantiate the Jena Model object we will use to access and manipulate the RDF model in the storage
  • 85. Configuring the OWL reasoner Reasoner reasoner = ReasonerRegistry.getOWLMicroReasoner(); model = ModelFactory.createInfModel(reasoner, model); Jena offers numerous options to configure the internal rule-based inference engine with different expressivity-performance tradeoffs We only need simple reasoning features (i.e., the transitive closure of subClassOf and subPropertyOf), so the OWL Micro configuration is the most appropriate one With row 1 we instantiate an OWL Micro reasoner With row 2 we instantiate a model with inference support using the model previously created and the OWL Micro reasoner
  • 86. D.1 Model the application ontology D.2 Model the content ontology R.1 Users’ needs analysis R.3 Software requirements analysis R.4 Content requirements analysis D.3 Model sample contents Reuse Merge Extend I.1 Implement the initial Knowledge Base V.1 Validation I.3 Choose content annotation methods I.2 Implement the integrated model Reuse Merge Extend I.4 Implement the application R.2 Risk analysis D.4 Design Application T.1 Testing
  • 87. Implement the integrated model (1) We move on with the implementation of meex by realizing the integrated model (step I.2) In the integrated model we merge the application and content ontologies Our intent is to semantically integrate the heterogeneous data coming from the external data sources In order to realize the integrated model we need to define a bridge ontology that uses the properties rdfs:subClassOf and rdfs:subPropertyOf to connect classes and properties in the application ontology to those in the content ontology
  • 88. Implement the integrated model (2) mb:Artist rdfs:subClassOf meex:Performer . mb:related_artist rdfs:subPropertyOf meex:relatedPerformer. mm:Style rdfs:subClassOf meex:Style . mm:hasStyle rdfs:subPropertyOf meex:performsStyle . mm:from rdfs:subPropertyOf meex:fromCountry . evdb:Event rdfs:subClassOf meex:Event. evdb:hasWhen rdfs:subPropertyOf meex:hasWhen. evdb:hasWhere rdfs:subPropertyOf meex:hasWhere. In rows 1 and 2 we connect the ontology of MusicBrainz to the application ontology, i.e. the classes mb:Artist and meex:Performer the properties mb:related_artist and meex:relatedPerformer . Likewise, in rows 3, 4 and 5, we connect the ontology of MusicMoz to the application ontology and in rows 6, 7 and 8 we connect the ontology of EVDB to the application ontology
  • 89. Implement the integrated model (3) Thanks to this bridge ontology, when data loaded from the external data sources are inserted in the RDF storage (using the data-source-specific ontologies), the OWL Micro reasoner infers the triples that represent the same data in the application ontology meex can, therefore, query the RDF storage homogeneously in terms of the application ontology without having to care about the heterogeneous formats of the three data sources To give an idea of the differences, in the next slide we compare the data expressed in the MusicBrainz ontology and in the application ontology
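The effect of the bridge ontology can be illustrated without Jena by applying the two RDFS entailment rules involved (subPropertyOf and subClassOf) until no new triple appears. The sketch below is a deliberately naive fixpoint loop written for this purpose; the class `BridgeInferenceSketch`, the string encoding of triples and the resources `ex:beatles` and `ex:clapton` are assumptions made for illustration, and this is not how the Jena OWL Micro reasoner is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class BridgeInferenceSketch {

    static final String TYPE = "rdf:type";
    static final String SUB_CLASS = "rdfs:subClassOf";
    static final String SUB_PROPERTY = "rdfs:subPropertyOf";

    /** A triple is a three-element list: subject, predicate, object. */
    static List<String> triple(String s, String p, String o) {
        return Arrays.asList(s, p, o);
    }

    /** Returns the asserted triples plus everything derivable with the two rules below. */
    static Set<List<String>> infer(Set<List<String>> asserted) {
        Set<List<String>> closure = new LinkedHashSet<>(asserted);
        boolean changed = true;
        while (changed) {
            List<List<String>> derived = new ArrayList<>();
            for (List<String> axiom : closure) {
                for (List<String> fact : closure) {
                    // (p subPropertyOf q), (s p o)  =>  (s q o)
                    if (axiom.get(1).equals(SUB_PROPERTY) && fact.get(1).equals(axiom.get(0))) {
                        derived.add(triple(fact.get(0), axiom.get(2), fact.get(2)));
                    }
                    // (C subClassOf D), (s rdf:type C)  =>  (s rdf:type D)
                    if (axiom.get(1).equals(SUB_CLASS) && fact.get(1).equals(TYPE)
                            && fact.get(2).equals(axiom.get(0))) {
                        derived.add(triple(fact.get(0), TYPE, axiom.get(2)));
                    }
                }
            }
            // addAll returns true only if at least one new triple was added.
            changed = closure.addAll(derived);
        }
        return closure;
    }

    public static void main(String[] args) {
        Set<List<String>> kb = new LinkedHashSet<>();
        // Two axioms taken from the bridge ontology.
        kb.add(triple("mb:Artist", SUB_CLASS, "meex:Performer"));
        kb.add(triple("mb:related_artist", SUB_PROPERTY, "meex:relatedPerformer"));
        // Hypothetical instance data expressed in the MusicBrainz ontology.
        kb.add(triple("ex:beatles", TYPE, "mb:Artist"));
        kb.add(triple("ex:beatles", "mb:related_artist", "ex:clapton"));
        for (List<String> t : infer(kb)) {
            System.out.println(String.join(" ", t) + " .");
        }
    }
}
```

The output contains the original triples together with the same data restated in the application ontology, which is what lets meex write its queries only against meex: terms.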
  • 90. Implement the integrated model (4) mb:artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d.html a mb:Artist ; rdfs:label &quot;The Beatles&quot; ; mb:related_artist mb:artist/ebfc1398-8d96-47e3-82c3-f782abcdb13d.html, mb:artist/618b6900-0618-4f1e-b835-bccb17f84294.html. SampleInstance-MusicBrainz.n3 mb:artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d.html a meex:Performer ; rdfs:label &quot;The Beatles&quot; ; meex:relatedPerformer mb:artist/ebfc1398-8d96-47e3-82c3-f782abcdb13d.html , mb:artist/618b6900-0618-4f1e-b835-bccb17f84294.html . Data-in-MusicBrainz-inferred-using-the-bridge-ontology.n3
  • 91. Implement the integrated model (5) model.read("Meex.n3", "", "N3"); model.read("Google.n3", "", "N3"); model.read("MeexBindings.n3", "", "N3"); model.read("MusicBrainz.n3", "", "N3"); model.read("MusicMoz.n3", "", "N3"); model.read("EVDB.n3", "", "N3"); Now that we have configured both the RDF storage and the reasoner we can load all the ontologies Note that the read method of model requires: The name of the file to load, The base URI (in our case all URIs are absolute) and The RDF syntax in which the data are serialized
  • 92. D.1 Model the application ontology D.2 Model the content ontology R.1 Users’ needs analysis R.3 Software requirements analysis R.4 Content requirements analysis D.3 Model sample contents Reuse Merge Extend I.1 Implement the initial Knowledge Base V.1 Validation I.3 Choose content annotation methods I.2 Implement the integrated model Reuse Merge Extend I.4 Implement the application R.2 Risk analysis D.4 Design Application T.1 Testing
  • 93. Testing the integrated model A simple test, which we can perform to verify the semantic soundness of all the ontologies we modelled, consists in loading into the model the examples we produced (in step D.3) and extracting the entire content of the RDF storage into a single file using the write method model.write(new FileOutputStream("Dump.n3"), "N3"); If we open the file Dump.n3 we can verify the presence of all the inferred triples we presented in slide 90
  • 94. D.1 Model the application ontology D.2 Model the content ontology R.1 Users’ needs analysis R.3 Software requirements analysis R.4 Content requirements analysis D.3 Model sample contents Reuse Merge Extend I.1 Implement the initial Knowledge Base V.1 Validation I.3 Choose content annotation methods I.2 Implement the integrated model Reuse Merge Extend I.4 Implement the application R.2 Risk analysis D.4 Design Application T.1 Testing
  • 95. Choose content annotation methods Following the proposed approach, the next step (i.e. I.3) is to choose the content annotation methods The contents we choose for meex are already annotated at the data source level; we (only) need to lift the data from XML or from the relational database as instances of the content ontology In the following slides we show how to implement and configure all the components necessary to allow meex to load data from the external data sources
  • 96. meex interfaces (1) MusicBrainz database Adapter Database  RDF SPARQL Server EVDB REST service MusicMoz File XML meex XML Browser Web 3) HTML and RDF 2) RDF GRDDL processor EVDB  RDF MusicMoz  RDF XML 2) RDF 1 ) Music style User
  • 97. Importing annotations from MusicBrainz The annotations of MusicBrainz are distributed as a dump of a PostgreSQL database So, first of all, we install the relational database PostgreSQL; the necessary documentation is available on the PostgreSQL and MusicBrainz official websites When the database is available we need to install and configure a translator from relational database to RDF and a SPARQL endpoint We choose D2RQ as translator and Joseki as SPARQL server
  • 98. Configuring D2RQ for MusicBrainz (1) @prefix map: <http://swa.cefriel.it/meex/D2RQ-MusicBrainz.n3#> . @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> . @prefix owl: <http://www.w3.org/2002/07/owl#> . @prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> . @prefix mb: <http://musicbrainz.org/> . map:database a d2rq:Database ; d2rq:jdbcDriver "org.postgresql.Driver" ; d2rq:jdbcDSN "jdbc:postgresql://localhost:5432/MusicBrainzDB" ; d2rq:username "postgres" ; d2rq:password "sw-book" . [more to follow] D2RQ-MusicBrainzDB.n3 Tables involved: artist (id, gid), artist_relation (artist, ref)
  • 99. Configuring D2RQ for MusicBrainz (2) [follows] map:artist a d2rq:ClassMap ; d2rq:dataStorage map:database ; d2rq:class mb:Artist ; d2rq:uriPattern "http://musicbrainz.org/artist/@@artist.gid@@.html" . map:artist_name a d2rq:PropertyBridge ; d2rq:belongsToClassMap map:artist ; d2rq:property rdfs:label ; d2rq:column "artist.name" . map:artist_relation a d2rq:PropertyBridge ; d2rq:belongsToClassMap map:artist ; d2rq:property mb:related_artist ; d2rq:join "artist.id = artist_relation.artist" ; d2rq:join "artist_relation.ref = artist2.id" ; d2rq:uriPattern "http://musicbrainz.org/artist/@@artist2.gid@@.html" . D2RQ-MusicBrainzDB.n3 NOTE: due to a limitation of D2RQ we need to create a view of the artist table: CREATE VIEW artist2 AS SELECT * FROM artist Tables involved: artist (id, gid), artist_relation (artist, ref)
  • 100. Configuring Joseki for MusicBrainz [] rdf:type joseki:Service ; rdfs:label "SPARQL for MusicBrainzDB" ; joseki:serviceRef "MusicBrainz" ; joseki:dataset _:MusicBrainzDS ; joseki:processor joseki:ProcessorSPARQL_FixedDS . _:MusicBrainzDS rdf:type ja:RDFDataset ; ja:defaultGraph _:MusicBrainzModel ; rdfs:label "MusicBrainz Dataset" . _:MusicBrainzModel rdf:type d2rq:D2RQModel ; rdfs:label "MusicBrainz D2RQ Model" ; d2rq:mappingFile <file:D2RQ-MusicBrainzDB.n3> ; d2rq:resourceBaseURI <http://musicbrainz.org/> . joseki-config.ttl With row 1 we expose a SPARQL endpoint, giving the name of the service and the URL at which it will become accessible: http://localhost:2020/MusicBrainz With rows 2 and 3 we configure the SPARQL endpoint to expose MusicBrainz via D2RQ using the configuration file D2RQ-MusicBrainzDB.n3 (see previous slide)
  • 101. Testing the SPARQL endpoint String sparqlQueryString = "PREFIX mb: <http://musicbrainz.org/>\n" + "DESCRIBE <" + artist + ">"; Query query = QueryFactory.create(sparqlQueryString); QueryExecution qexec = QueryExecutionFactory.sparqlService("http://localhost:2020/MusicBrainz", query); Model resultModel = qexec.execDescribe(); We choose ARQ to test the MusicBrainz SPARQL endpoint by submitting a DESCRIBE SPARQL query to obtain the description of an artist With row 1 we define the SPARQL query, in which the variable artist contains the URI of the artist we want described With rows 2 and 3 we instantiate a Query object and we configure the QueryExecution to send the query to the endpoint at the URL http://localhost:2020/MusicBrainz With row 4 we execute the query and we obtain a Jena model as a result
  • 102. meex interfaces (2) MusicBrainz database Adapter Database  RDF SPARQL Server EVDB REST service MusicMoz File XML meex XML Browser Web 3) HTML and RDF 2) RDF GRDDL processor EVDB  RDF MusicMoz  RDF XML 2) RDF 1 ) Music style User
  • 103. Importing annotations from MusicMoz and EVDB The MusicBrainz SPARQL endpoint is ready; let's import the annotations from MusicMoz and EVDB. They both exchange data in XML. In the design steps we chose to use a GRDDL processor to convert XML into RDF (in the RDF/XML syntax) The GRDDL recommendation requires the XML documents to refer directly to the XSLT that performs the translation. Neither the MusicMoz nor the EVDB XML files originally include the reference required by GRDDL We can add it programmatically In the following slide we show an excerpt of the modified XML files for MusicMoz We can proceed likewise for EVDB
  • 104. Importing annotations from MusicMoz (1) <musicmoz xmlns:grddl='http://www.w3.org/2003/g/data-view#' grddl:transformation="file:///[...]/musicmoz-to-rdf.xsl"> <category name="Bands_and_Artists/B/Beatles,_The" type="band"> <resource name="musicbrainz" link="http://musicbrainz.org/artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d.html"/> <from>England</from> <style number="1">British Invasion</style> <style number="2">Rock</style> <style number="3">Skiffle</style> </category> <style><name>British Invasion</name></style> <style><name>Rock</name></style> <style><name>Skiffle</name></style> </musicmoz> Excerpts from the files musicmoz.bandsandartists.xml and musicmoz.lists.styles.xml
  • 105. Importing annotations from MusicMoz (2) <xsl:template match="musicmoz/category[(@type='band' or @type='artist') and resource/@name='musicbrainz']"> <xsl:variable name="artist_uri" select="resource[@name='musicbrainz']/@link"/> <xsl:for-each select="style"> <xsl:variable name="style_reformatted" select="concat('http://musicmoz.org/style/',text())"/> <rdf:Description rdf:about="{$artist_uri}"> <mm:hasStyle rdf:resource="{$style_reformatted}"/> </rdf:Description> </xsl:for-each> <rdf:Description rdf:about="{$artist_uri}"> <mm:from><xsl:value-of select="from"/></mm:from> </rdf:Description> </xsl:template> <xsl:template match="musicmoz/style"> <xsl:variable name="style_reformatted" select="concat('http://musicmoz.org/style/', name)"/> <mm:Style rdf:about="{$style_reformatted}"> <rdfs:label><xsl:value-of select="name"/></rdfs:label> </mm:Style> </xsl:template> Excerpts from the file musicmoz-to-rdf.xsl
  • 106. Importing annotations from MusicMoz (3) As GRDDL processor we choose GRDDL Reader, the GRDDL processor for Jena. With row 1 we instantiate a Jena model that will temporarily contain the RDF data produced by the GRDDL processor With row 2 we instantiate an RDFReader that uses a GRDDL processor to load RDF data With rows 3 and 4 we load into the RDF model instantiated in row 1 the data contained in the XML files of MusicMoz, using the RDF reader configured for GRDDL With row 5 we merge the loaded RDF data with those already present in the RDF storage Model mmModel = ModelFactory.createDefaultModel(); RDFReader reader = mmModel.getReader("GRDDL"); reader.read(mmModel, "file:///.../musicmoz.bandsandartists.xml"); reader.read(mmModel, "file:///.../musicmoz.lists.styles.xml"); model.add(mmModel);
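One possible way to add the GRDDL reference programmatically is to parse the XML, set the two attributes on the root element and write the document back. The sketch below uses only the DOM and TrAX APIs shipped with the JDK; the class name `GrddlAnnotator` and the stylesheet location used in `main` are assumptions made for illustration, and the meex source code may well do this differently (for example with plain string manipulation).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class GrddlAnnotator {

    static final String GRDDL_NS = "http://www.w3.org/2003/g/data-view#";
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    /** Parses the XML and adds xmlns:grddl and grddl:transformation to the root element. */
    static Document annotate(String xml, String xsltUri) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            // Declare the GRDDL namespace, then point to the transformation.
            root.setAttributeNS(XMLNS_NS, "xmlns:grddl", GRDDL_NS);
            root.setAttributeNS(GRDDL_NS, "grddl:transformation", xsltUri);
            return doc;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot annotate XML document", e);
        }
    }

    /** Serializes the DOM back to a string without the XML declaration. */
    static String serialize(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Cannot serialize XML document", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical stylesheet location; replace it with the real path.
        Document doc = annotate("<musicmoz><category/></musicmoz>",
                "file:///tmp/musicmoz-to-rdf.xsl");
        System.out.println(serialize(doc));
    }
}
```

The same helper can be applied to the XML returned by the EVDB REST service before it is handed to the GRDDL processor.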
  • 107. So far so good! (1) MusicBrainz database Adapter Database  RDF SPARQL Server EVDB REST service MusicMoz File XML meex XML Browser Web 3) HTML and RDF 2) RDF GRDDL processor EVDB  RDF MusicMoz  RDF XML 2) RDF 1 ) Music style User
  • 108. So far so good! (2) Ajax Web Framework GRDDL Processor For each Artist SPARQL Client MusicBrainz SPARQL Endpoint HTTP REST Client EVDB HTTP REST service GRDDL Processor EVDB  RDF MusicMoz  RDF Linking Artists to events RDF Merge Extraction and Transformation Ajax Web Framework Music style Set of artists in RDF Artist SPARQL Query Events in XML Events in RDF Artists and events in RDF Artist data in RDF HTTP Query RDF data Artists and events in RDF
  • 109. D.1 Model the application ontology D.2 Model the content ontology R.1 Users’ needs analysis R.3 Software requirements analysis R.4 Content requirements analysis D.3 Model sample contents Reuse Merge Extend I.1 Implement the initial Knowledge Base V.1 Validation I.3 Choose content annotation methods I.2 Implement the integrated model Reuse Merge Extend I.4 Implement the application R.2 Risk analysis D.4 Design Application T.1 Testing
  • 110. What’s left? All the business logic that coordinates the interaction among the internal components is still to be implemented NOTE: implementing the business logic requires both writing many lines of pure Java code and working with several Semantic Web technologies; we will focus our attention on the Semantic Web technologies The complete Java code is available for download on the meex website. See http://swa.cefriel.it/meex
  • 111. What’s left? Ajax Web Framework GRDDL Processor For each Artist SPARQL Client MusicBrainz SPARQL Endpoint HTTP REST Client EVDB HTTP REST service GRDDL Processor EVDB  RDF MusicMoz  RDF Linking Artists to events RDF Merge Extraction and Transformation Ajax Web Framework Music style Set of artists in RDF Artist SPARQL Query Events in XML Events in RDF Artists and events in RDF Artist data in RDF HTTP Query RDF data Artists and events in RDF
  • 112. MEMO: Execution Semantics (1) 1. The user requests a music style 2. meex accesses the local copy of MusicMoz and, using the GRDDL processor, obtains the set of artists that play the given music style [more to follow]
  • 113. Step 2: from the music style to the artists Step 2 of the meex execution semantics requires querying MusicMoz for the artists that play the music style requested by the user The following Java code shows how to encode the SPARQL query in terms of the application ontology String sparqlQueryString = "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n" + "PREFIX meex: <http://swa.cefriel.it/meex#>\n" + "SELECT DISTINCT ?performer \n" + "WHERE { ?performer meex:performsStyle ?style.\n" + " ?style rdfs:label \"" + style + "\".}";
  • 114. MEMO: Execution Semantics (2) [follows] 3. For each artist meex: a. uses the SPARQL client to query the MusicBrainz SPARQL endpoint and obtains the artist's name and his/her relationships with other artists b. invokes the EVDB REST service, obtains the events that refer to the artist in XML and uses the GRDDL processor to obtain this data in RDF c. links the data about each artist to the data about the events that refer to him/her [more to follow]
  • 115. Step 3.a: querying MusicBrainz Step 3.a of the meex execution semantics requires querying MusicBrainz for the data that describe an artist, including the related artists String sparqlQueryString = "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n" + "PREFIX mb: <http://musicbrainz.org/>\n" + "DESCRIBE <" + artist + ">"; SPARQLClient sparqlClient = new SPARQLClient(null); try { return sparqlClient.executeDescribeQuery(sparqlQueryString, Config.MusicBrainzSPARQLEndpoint); } finally { sparqlClient.closeQuery(); } Excerpts from the file MusicBrainz.java
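Because the music style comes from user input and is concatenated into the query text, a style containing a double quote or a backslash would produce an invalid (or unintended) query. A minimal sketch of a safer variant follows; `StyleQueryBuilder` and its methods are hypothetical helpers introduced here for illustration and are not part of the meex code base or of Jena.

```java
public class StyleQueryBuilder {

    /**
     * Escapes backslashes, double quotes and line breaks so that the value
     * can be placed between double quotes as a SPARQL string literal.
     */
    static String escapeLiteral(String value) {
        return value.replace("\\", "\\\\")
                    .replace("\"", "\\\"")
                    .replace("\n", "\\n")
                    .replace("\r", "\\r");
    }

    /** Builds the step 2 query: performers that play the given style. */
    static String performersByStyle(String style) {
        return "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
             + "PREFIX meex: <http://swa.cefriel.it/meex#>\n"
             + "SELECT DISTINCT ?performer\n"
             + "WHERE { ?performer meex:performsStyle ?style .\n"
             + "        ?style rdfs:label \"" + escapeLiteral(style) + "\" . }";
    }

    public static void main(String[] args) {
        System.out.println(performersByStyle("British Invasion"));
    }
}
```

The resulting string can be passed to QueryFactory.create in the same way as the query string shown in the slide.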
  • 116. Step 3.b: querying EVDB Step 3.b of the meex execution semantics requires invoking the EVDB REST service, obtaining the list of events in XML and using the GRDDL processor to obtain the RDF invokeHttpEndpoint(performerLabel, eventsFilename); prepareForGRDDL(eventsFilename); Model m = GRDDLProcessor.ApplyGRDDLTransformation(eventsFilename); private static void invokeHttpEndpoint(String keywords, String outputFilename) throws IOException { URL url = new URL("http://api.evdb.com/rest/events/atom?sort_order=relevance&" + "keywords=" + URLEncoder.encode(keywords, "UTF-8") + "&category=music&app_key=" + Config.EVDBKey); URLConnection conn = url.openConnection(); conn.setDoOutput(true); BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream())); […] while ((inLine = in.readLine()) != null) writer.write(inLine + "\n"); } Excerpts from the file EVDB.java
  • 117. Step 3.c: linking artists to events Step 3.c of the meex execution semantics requires linking the artist information retrieved from MusicMoz and MusicBrainz to the event information retrieved from EVDB We can use the following SPARQL CONSTRUCT query to create the links String sparqlQueryString = "PREFIX meex: <http://swa.cefriel.it/meex#>\n" + "CONSTRUCT {<" + performer + "> meex:performsEvent ?event.}\n" + "WHERE {?event a meex:Event.}";
  • 118. MEMO: Execution Semantics (3) [follows] 4. When all the pieces of information about artists and events are available in the RDF storage, meex extracts them and serializes them in the format of the Ajax Web framework 5. The Ajax Web framework lets the user explore the events found by meex 6. When the user decides to start a new exploration, meex starts over from the beginning
  • 119. Step 4: preparing the data for the GUI We choose Exhibit as Ajax Web framework because it: allows faceted browsing; allows grouping and filtering events by artist name, artist nationality, the style the artist plays and the related artists; includes different views: an ordered list, a chronological graph and a geographic map
  • 120. Step 4: configuring Exhibit We can configure Exhibit by means of two files: an HTML page that controls the look and feel and a JSON file that contains the data to be explored by the user In this tutorial we focus on the preparation of the JSON file. We refer to the Exhibit documentation and the website of our Semantic Web book for the preparation of the HTML page of Exhibit for meex A JSON file is a simple text file that contains data organized in a set of records. In the following slide we show the information about an event featuring The Beatles expressed in JSON.
  • 121. Step 4: a sample JSON file type: "Event", label: "1964 The Tribute Tribute to Beatles", eventful_link: "http://eventful.com/events/E0-001-006129372-5", when_startTime: "2008-01-25", when_endTime: "2008-01-26", where_label: "Paramount Theater", where_address: "17 South Street, New York 10940, United States", where_latlng: "41.4544,-74.471", performer_label: "The Beatles", fromCountry: "England", styles: ["Skiffle", "British Invasion", "Rock"], relatedPerformers: ["The Beach Boys", "Eric Clapton"]
  • 122. Step 4: serializing RDF in JSON In order to serialize RDF in JSON: we extract the information we loaded in the RDF storage using the SPARQL query shown in the following slide; we serialize the result in JSON NOTE: as we have already said several times, the query can be expressed in terms of the application ontology even if the data were loaded in other, heterogeneous formats
  • 123. Step 4: extracting the data PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX meex: <http://swa.cefriel.it/meex#> PREFIX gd: <http://schemas.google.com/g/2005> SELECT DISTINCT ?event ?event_label ?when_startTime ?when_endTime ?where_label ?where_address ?where_lat ?where_lon ?performer ?performer_label ?fromCountry WHERE { ?event rdfs:label ?event_label; meex:hasWhen ?when; meex:hasWhere ?where. ?when gd:startTime ?when_startTime; gd:endTime ?when_endTime. ?where gd:label ?where_label; gd:postalAddress ?where_address; gd:hasGeoPt ?geoPt. ?geoPt gd:lat ?where_lat; gd:lon ?where_lon. ?performer meex:performsEvent ?event; rdfs:label ?performer_label; meex:fromCountry ?fromCountry.}
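The last part of step 4, turning one row of the SELECT result into an Exhibit item, can be sketched as follows. The code assumes that each row has already been copied into a map from variable name to string value (with Jena this would come from iterating the result set); the class `ExhibitItemWriter` is hypothetical, it emits keys in double quotes so that the output is strict JSON, and it leaves out the multi-valued fields (styles, relatedPerformers) that the query above does not return.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ExhibitItemWriter {

    /** Wraps a value in double quotes, escaping backslashes and quotes. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /**
     * Renders one result row (variable name -> value) as an Exhibit item.
     * Every variable is assumed to be bound, which holds for the query above
     * because it contains no OPTIONAL parts.
     */
    static String toItem(Map<String, String> row) {
        Map<String, String> item = new LinkedHashMap<>();
        item.put("type", "Event");
        item.put("label", row.get("event_label"));
        item.put("when_startTime", row.get("when_startTime"));
        item.put("when_endTime", row.get("when_endTime"));
        item.put("where_label", row.get("where_label"));
        item.put("where_address", row.get("where_address"));
        // The sample JSON file stores latitude and longitude in a single field.
        item.put("where_latlng", row.get("where_lat") + "," + row.get("where_lon"));
        item.put("performer_label", row.get("performer_label"));
        item.put("fromCountry", row.get("fromCountry"));

        StringBuilder json = new StringBuilder("{");
        String separator = "";
        for (Map.Entry<String, String> field : item.entrySet()) {
            json.append(separator)
                .append(quote(field.getKey())).append(": ").append(quote(field.getValue()));
            separator = ", ";
        }
        return json.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("event_label", "1964 The Tribute Tribute to Beatles");
        row.put("when_startTime", "2008-01-25");
        row.put("when_endTime", "2008-01-26");
        row.put("where_label", "Paramount Theater");
        row.put("where_address", "17 South Street, New York 10940, United States");
        row.put("where_lat", "41.4544");
        row.put("where_lon", "-74.471");
        row.put("performer_label", "The Beatles");
        row.put("fromCountry", "England");
        System.out.println(toItem(row));
    }
}
```

In a complete serializer the items produced this way would be collected into the array that Exhibit reads from the JSON file.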
  • 126. Tools employed (1) Jena (application framework): http://jena.sourceforge.net; Derby (relational database for the RDF storage): http://db.apache.org/derby; PostgreSQL (relational database for MusicBrainz): http://www.postgresql.org; D2RQ (translator from relational database to RDF): http://sites.wiwiss.fu-berlin.de/suhl/bizer/d2rq
  • 127. Tools employed (2) Joseki (SPARQL endpoint server): http://www.joseki.org; ARQ (SPARQL query engine for Jena): http://jena.sourceforge.net/ARQ; GRDDL Reader (GRDDL processor): http://jena.sourceforge.net/grddl; Exhibit (Ajax Web framework): http://static.simile.mit.edu/exhibit
  • 128. Thank you for your attention. Any questions?
  • 130. Credits and Links CREDITS: Dario Cerizza, who helped in conceiving meex and designed and developed it; Irene Celino, who helped in conceiving meex and supported its design and development; all the people involved in CEFRIEL’s Semantic Web Activities. LINKS: Visit http://swa.cefriel.it; Try http://swa.cefriel.it/Squiggle; Try http://swa.cefriel.it/SOIP-F. ACKNOWLEDGEMENT: The research behind this tutorial is partially funded by the Italian research project NeP4B.
  • 131. Advertisement: if you speak Italian …