Engineering a
Semantic Web
ITWS Capstone (Spring 2014)
John S. Erickson, Ph.D ❇ Tetherless World Constellation ❇ RPI
Understanding...
Consuming...
Building...
Doing Science with...
...the Web of Data
Objectives
● Deeper understanding of Web architecture
● Understand the Semantic Web stack
● Linked Data principles & practices
● Build cool applications
● Contribute to the Web of Data
● Make better IT management decisions
First: Consuming the Web of Data
1. From the Web of Documents to the Web of Data
2. Linked Data: Building Blocks of the Web (of data)
3. Mashups: Consuming Linked Data
Web Architecture
First Principles of the Web...
● A standard system for identifying resources
● Standard formats for representing resources
● A standard protocol for exchanging resources
Relevant core standards:
● URIs (URLs): Uniform Resource Identifiers
● HTML: Hypertext Markup Language
● HTTP: Hypertext Transfer Protocol
Architecture of the World Wide Web, Volume One http://www.w3.org/TR/webarch/
Data Mining: Mapping the Blogosphere http://bit.ly/18MuXdD
Identifying Web Resources (1)
● A global identification system is essential
○ to share information about resources
○ to reason about resources
○ to modify or exchange resources
● Resources are anything that can be linked to or spoken of
○ Documents, cat videos, people, ideas...
● Not all resources are "on" the Web
○ They might be referenced from the Web...
○ ...while not being retrievable from it
○ These are (so-called) "non-information resources"
Les Carr, et al. http://slidesha.re/142MFrV
Identifying Web Resources (2)
● A global standard is required; the URI is it
● Other systems are possible...
○ ...but the added value of a single global system of identifiers is high
○ Enables linking, bookmarking and other functions across heterogeneous applications
● How are URIs used?
○ All resources have URIs associated with them
○ Each URI identifies a single resource in a context-independent manner
○ URIs act as names and (usually) addresses
○ In general URIs are "opaque"
Uniform Resource Identifier (URI): Generic Syntax (RFC 3986) http://www.ietf.org/rfc/rfc3986.txt
Identifying Web Resources (3)
● "URIs identify and URLs locate..."
○ ...and identify
● URLs are URIs aligned with protocols
○ URLs include the "access mechanism" or "network location", e.g. http:// or ftp://
○ How to "dereference" the URI and retrieve the thing
● URL examples
○ ftp://ftp.is.co.za/rfc/rfc1808.txt
○ http://www.ietf.org/rfc/rfc2396.txt
○ mailto:John.Doe@example.com
○ telnet://192.0.2.16:80/
Uniform Resource Identifier (URI): Generic Syntax (RFC 3986) http://www.ietf.org/rfc/rfc3986.txt
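A minimal sketch, not from the original slides: the scheme that ties each example URL to an access mechanism can be inspected with the standard URL class built into browsers and Node.js. The helper name describeUri is hypothetical.

```javascript
// Hypothetical helper: split a URI into the parts discussed above.
// Uses the standard WHATWG URL class (built into browsers and Node.js).
function describeUri(uri) {
  const parts = new URL(uri);
  return {
    scheme: parts.protocol.replace(/:$/, ""), // access mechanism, e.g. "ftp"
    host: parts.hostname,                     // network location (empty for mailto:)
    path: parts.pathname                      // remainder that identifies the resource
  };
}

// The URL examples from the slide.
const examples = [
  "ftp://ftp.is.co.za/rfc/rfc1808.txt",
  "http://www.ietf.org/rfc/rfc2396.txt",
  "mailto:John.Doe@example.com",
  "telnet://192.0.2.16:80/"
];

for (const uri of examples) {
  console.log(uri, describeUri(uri));
}
```

Note that the code only reads the scheme and location; it treats the rest of each URI as opaque, in line with the earlier slide.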
Representing Resources (1)
● Resources are manifest as digital files
○ More precisely: serializations that look like files...
● The Web recognizes a (growing) set of {file | serialization} formats
○ The original and workhorse is HTML...
○ ...but there are many others
● Retrievable resources on the web serve multiple purposes
○ Resources encode information and data
○ Resources aggregate links to other resources
● This is what makes The Web(tm) a "web..."
Resources (nodes) aggregate links to other resources to create a Web
Retrieving Resources (1)
● Review: URLs refer to retrievable resources
○ i.e., URIs that specify some protocol for retrieval
● The original and most common Web protocol is HTTP
● Specialized protocols are possible, but resources may appear "off the grid..."
● The more common case is HTTP with different formats...
URIs, HTTP, many formats...
http://www.w3.org/2006/Talks/0521-sb-AC-management/ReCTechStack-bg.png
Principles for creating a healthy Web
Tim Berners-Lee http://www.w3.org/DesignIssues/LinkedData.html
● Use URIs as names for things
● Use HTTP URIs so people can look up those names
● When someone looks up a URI, return useful information
○ use standard representation formats to express it
● Include links to other URIs, so consumers can discover more things
○ By "consumers" we mean people or applications
Why is linking important???
Implications of a well-connected Web
Links to other nodes are a "vote" of quality and/or relevance
PageRank https://en.wikipedia.org/wiki/PageRank
Google PageRank
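To make the "links as votes" idea concrete, here is a simplified PageRank sketch using power iteration. It is an illustration only, not Google's implementation; the four-page web, the damping factor of 0.85 and the iteration count are assumptions chosen for the example.

```javascript
// Simplified PageRank by power iteration (illustrative only).
// `links` maps each page to the pages it links to. Every page in the
// example has at least one outgoing link, so dangling pages are not handled.
function pageRank(links, damping = 0.85, iterations = 50) {
  const pages = Object.keys(links);
  const n = pages.length;
  let rank = {};
  for (const p of pages) rank[p] = 1 / n; // start with equal rank

  for (let i = 0; i < iterations; i++) {
    const next = {};
    for (const p of pages) next[p] = (1 - damping) / n;
    for (const p of pages) {
      const share = rank[p] / links[p].length; // a page splits its vote evenly
      for (const target of links[p]) next[target] += damping * share;
    }
    rank = next;
  }
  return rank;
}

// Hypothetical four-page web: A, B and D all "vote" for C; C links to A.
const toyWeb = { A: ["C"], B: ["C"], C: ["A"], D: ["C"] };
const ranks = pageRank(toyWeb);
console.log(ranks);
```

In this toy web, C ends up with the highest rank because it collects the most votes, and A ranks above B and D because its single inbound link comes from the well-linked C.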
What's this "Semantic Web?"
...and where can I get one???
“Web of meaning”
Web of Data
Linked Data
Linking ideas...
Semantic Web?
meaning
:
ideas
:
data
Semantic Web Building Blocks
RDF: Resource Description Framework
[Figure: a triple drawn as subject, predicate, object]
Example with plain labels:
subject: "article"
predicate: "has creator"
object: "James Hendler"
Example with URIs:
subject: http://dx.doi.org/10.1109/MC.2009.30 (doi:10.1109/MC.2009.30)
predicate: http://purl.org/dc/elements/1.1/creator
object: http://dbpedia.org/resource/James_Hendler
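As a small sketch of the subject, predicate, object structure, the "article has creator James Hendler" example can be held as a plain object and written out as one N-Triples line. The helper name toNTriples is hypothetical, and it only handles triples whose three terms are all URIs.

```javascript
// A triple as a plain object; all three terms are URIs in this example.
const triple = {
  subject: "http://dx.doi.org/10.1109/MC.2009.30",
  predicate: "http://purl.org/dc/elements/1.1/creator",
  object: "http://dbpedia.org/resource/James_Hendler"
};

// Hypothetical helper: write a URI-only triple as one N-Triples statement.
function toNTriples(t) {
  return `<${t.subject}> <${t.predicate}> <${t.object}> .`;
}

console.log(toNTriples(triple));
```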
We're missing something...
● Check: URIs for names: S, P, O can be URIs
● Check: HTTP URIs: all of our examples are resolvable
● Now: "Return something useful" when we resolve URIs
○ How do we serialize RDF?
○ How do we retrieve RDF?
Let's go to the graph...
Source: Programming the Semantic Web
http://bit.ly/1aZwr40
"Raw" Triples
Via the W3C RDF Validator Service:
http://www.w3.org/RDF/Validator/
N-Triples...
<http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/homepage> <http://kiwitobes.com/> .
<http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/nick> "kiwitobes" .
<http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/name> "Toby Segaran" .
<http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/mbox> <mailto:toby@segaran.com> .
<http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/interest> <http://semprog.com> .
<http://kiwitobes.com/toby.rdf#ts> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/knows> _:jamie .
<http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/knows> <http://semprog.com/people/colin> .
_:jamie <http://xmlns.com/foaf/0.1/name> "Jamie Taylor" .
_:jamie <http://xmlns.com/foaf/0.1/mbox> <mailto:jamie@semprog.com> .
_:jamie <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://semprog.com/people/colin> <http://xmlns.com/foaf/0.1/name> "Colin Evans" .
<http://semprog.com/people/colin> <http://xmlns.com/foaf/0.1/mbox> <mailto:colin@semprog.com> .
<http://semprog.com/people/colin> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://semprog.com> <http://www.w3.org/2000/01/rdf-schema#label> "Semantic Programming" .
<http://semprog.com> <http://www.w3.org/2000/01/rdf-schema#type> <http://xmlns.com/foaf/0.1/Document> .
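Because each N-Triples statement sits on one line, a rough parser fits in a few lines. The sketch below is an assumption-laden illustration, not a conforming parser: it accepts only URIs, blank nodes and plain string literals, and ignores language tags, datatypes, escapes and comments. The names TRIPLE_PATTERN and parseNTriples are hypothetical.

```javascript
// Matches one simple N-Triples statement:
//   subject   = <IRI> or _:blankNode
//   predicate = <IRI>
//   object    = <IRI>, _:blankNode, or a plain "literal"
// Language tags, datatypes, escapes and comments are deliberately not handled.
const TRIPLE_PATTERN =
  /^\s*(<[^>]*>|_:\w+)\s+(<[^>]*>)\s+(<[^>]*>|_:\w+|"[^"]*")\s*\.\s*$/;

function parseNTriples(text) {
  const triples = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines
    const match = TRIPLE_PATTERN.exec(line);
    if (!match) throw new Error("Unsupported N-Triples line: " + line);
    triples.push({ subject: match[1], predicate: match[2], object: match[3] });
  }
  return triples;
}

// Three statements taken from the listing above.
const sample = [
  '<http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/name> "Toby Segaran" .',
  '<http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/knows> _:jamie .',
  '_:jamie <http://xmlns.com/foaf/0.1/name> "Jamie Taylor" .'
].join("\n");

const statements = parseNTriples(sample);
console.log(statements.length, "triples parsed");
```

For real data, a maintained RDF library is the safer choice; the point here is only that the format is line-oriented and simple to read.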
N3...
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix semperp: <http://semprog.com/people/>.
@prefix tobes: <http://kiwitobes.com/toby.rdf#>.

tobes:ts a foaf:Person;
    foaf:homepage <http://kiwitobes.com/>;
    foaf:interest <http://semprog.com>;
    foaf:knows semperp:colin,
        [ a foaf:Person;
          foaf:mbox <mailto:jamie@semprog.com>;
          foaf:name "Jamie Taylor" ];
    foaf:mbox <mailto:toby@segaran.com>;
    foaf:name "Toby Segaran";
    foaf:nick "kiwitobes".

<http://semprog.com> a foaf:Document;
    rdfs:label "Semantic Programming".

semperp:colin a foaf:Person;
    foaf:mbox <mailto:colin@semprog.com>;
    foaf:name "Colin Evans".
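N3 is concise mainly because of its prefixes. As a sketch of what a parser does with them, the hypothetical helper expand below turns a prefixed name back into the full URI that appears in the N-Triples form. The prefix table is copied from the @prefix lines; everything else is an assumption for illustration.

```javascript
// Prefix table copied from the @prefix declarations in the N3 listing.
const prefixes = {
  foaf: "http://xmlns.com/foaf/0.1/",
  rdf: "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
  rdfs: "http://www.w3.org/2000/01/rdf-schema#",
  semperp: "http://semprog.com/people/",
  tobes: "http://kiwitobes.com/toby.rdf#"
};

// Hypothetical helper: expand a prefixed name such as "foaf:name".
// The N3 keyword "a" is shorthand for rdf:type.
function expand(name) {
  if (name === "a") return prefixes.rdf + "type";
  const colon = name.indexOf(":");
  const base = colon < 0 ? undefined : prefixes[name.slice(0, colon)];
  if (typeof base !== "string") {
    throw new Error("Unknown prefix in: " + name);
  }
  return base + name.slice(colon + 1);
}

console.log(expand("tobes:ts"));
console.log(expand("a"));
```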
RDFa...
:
<div xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" about="http://kiwitobes.com/toby.rdf#ts" typeof="foaf:Person">
  Name: <span property="foaf:name">Toby Segaran</span><br/>
  Nickname: <span property="foaf:nick">kiwitobes</span><br/>
  Interests: <a rel="foaf:interest" href="http://semprog.com">
    <span property="rdfs:label">Semantic Programming</span></a>
  Homepage: <a rel="foaf:homepage" href="http://kiwitobes.com/">KiwiTobes</a><p/>
  Friends:<br/>
  <ul rel="foaf:knows">
    <li about="http://semprog.com/people/colin"
        typeof="foaf:Person" property="foaf:name">Colin Evans</li>
    <li typeof="foaf:Person">
      <span property="foaf:name">Jamie Taylor</span><br/>
      Email: <a rel="foaf:mbox" href="mailto:jamie@semprog.com">
        jamie@semprog.com</a><br/>
    </li>
  </ul>
</div>
:
RDF/XML...
<rdf:RDF
    xmlns:foaf='http://xmlns.com/foaf/0.1/'
    xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
    xmlns:rdfs='http://www.w3.org/2000/01/rdf-schema#'>
  <foaf:Person rdf:about="http://kiwitobes.com/toby.rdf#ts">
    <foaf:name>Toby Segaran</foaf:name>
    <foaf:homepage rdf:resource="http://kiwitobes.com/"/>
    <foaf:nick>kiwitobes</foaf:nick>
    <foaf:mbox rdf:resource="mailto:toby@segaran.com"/>
    <foaf:interest>
      <foaf:Document rdf:about="http://semprog.com">
        <rdfs:label>Semantic Programming</rdfs:label>
      </foaf:Document>
    </foaf:interest>
    <foaf:knows>
      <foaf:Person rdf:about="http://semprog.com/people/colin">
        <foaf:name>Colin Evans</foaf:name>
        <foaf:mbox rdf:resource="mailto:colin@semprog.com"/>
      </foaf:Person>
    </foaf:knows>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Jamie Taylor</foaf:name>
        <foaf:mbox rdf:resource="mailto:jamie@semprog.com"/>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>
JSON-LD...
{
  "@graph": [
    {
      "@id": "http://kiwitobes.com/toby.rdf#ts",
      "@type": "http://xmlns.com/foaf/0.1/Person",
      "http://xmlns.com/foaf/0.1/homepage": {
        "@id": "http://kiwitobes.com/"
      },
      "http://xmlns.com/foaf/0.1/interest": {
        "@id": "http://semprog.com"
      },
      "http://xmlns.com/foaf/0.1/knows": [
        {
          "@type": "http://xmlns.com/foaf/0.1/Person",
          "http://xmlns.com/foaf/0.1/mbox": {
            "@id": "mailto:jamie@semprog.com"
          },
          "http://xmlns.com/foaf/0.1/name": "Jamie Taylor"
        },
        {
          "@id": "http://semprog.com/people/colin"
        }
      ],
      "http://xmlns.com/foaf/0.1/mbox": {
        "@id": "mailto:toby@segaran.com"
      },
      "http://xmlns.com/foaf/0.1/name": "Toby Segaran",
      "http://xmlns.com/foaf/0.1/nick": "kiwitobes"
    },
    {
      "@id": "http://semprog.com/people/colin",
      "@type": "http://xmlns.com/foaf/0.1/Person",
      "http://xmlns.com/foaf/0.1/mbox": {
        "@id": "mailto:colin@semprog.com"
      },
      "http://xmlns.com/foaf/0.1/name": "Colin Evans"
    },
    {
      "@id": "http://semprog.com",
      "@type": "http://xmlns.com/foaf/0.1/Document",
      "http://www.w3.org/2000/01/rdf-schema#label": "Semantic Programming"
    }
  ]
}
RDF/JSON...
{
  "http://kiwitobes.com/toby.rdf#ts": {
    "http://xmlns.com/foaf/0.1/nick": [
      {
        "type": "literal",
        "value": "kiwitobes"
      }
    ],
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [
      {
        "type": "uri",
        "value": "http://xmlns.com/foaf/0.1/Person"
      }
    ],
    "http://xmlns.com/foaf/0.1/interest": [
      {
        "type": "uri",
        "value": "http://semprog.com"
      }
    ],
    "http://xmlns.com/foaf/0.1/knows": [
      {
        "type": "uri",
        "value": "http://semprog.com/people/colin"
      },
      {
        "type": "bnode",
        "value": "_:N40b366148cfc4c48a80f4e15acbd2858"
      }
    ],
    "http://xmlns.com/foaf/0.1/mbox": [
      {
        "type": "uri",
        "value": "mailto:toby@segaran.com"
      }
    ],
    "http://xmlns.com/foaf/0.1/homepage": [
      {
        "type": "uri",
        "value": "http://kiwitobes.com/"
      }
    ],
    "http://xmlns.com/foaf/0.1/name": [
      {
        "type": "literal",
        "value": "Toby Segaran"
      }
    ]
  },
  "http://semprog.com/people/colin": {
    "http://xmlns.com/foaf/0.1/mbox": [
      {
        "type": "uri",
        "value": "mailto:colin@semprog.com"
      }
    ],
    "http://xmlns.com/foaf/0.1/name": [
      {
        "type": "literal",
        "value": "Colin Evans"
      }
    ],
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [
      {
        "type": "uri",
        "value": "http://xmlns.com/foaf/0.1/Person"
      }
    ]
  },
  "_:N40b366148cfc4c48a80f4e15acbd2858": {
    "http://xmlns.com/foaf/0.1/mbox": [
      {
        "type": "uri",
        "value": "mailto:jamie@semprog.com"
      }
    ],
    "http://xmlns.com/foaf/0.1/name": [
      {
        "type": "literal",
        "value": "Jamie Taylor"
      }
    ],
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [
      {
        "type": "uri",
        "value": "http://xmlns.com/foaf/0.1/Person"
      }
    ]
  },
  "http://semprog.com": {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [
      {
        "type": "uri",
        "value": "http://xmlns.com/foaf/0.1/Document"
      }
    ],
    "http://www.w3.org/2000/01/rdf-schema#label": [
      {
        "type": "literal",
        "value": "Semantic Programming"
      }
    ]
  }
}
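The RDF/JSON layout is subject, then predicate, then an array of typed objects, so recovering the flat list of triples is a three-level loop. This is a sketch with a hypothetical helper name (rdfJsonToTriples) and a trimmed excerpt of the document above as input.

```javascript
// Flatten an RDF/JSON document (subject -> predicate -> [objects]) into triples.
function rdfJsonToTriples(doc) {
  const triples = [];
  for (const [subject, predicates] of Object.entries(doc)) {
    for (const [predicate, objects] of Object.entries(predicates)) {
      for (const object of objects) {
        // object.type is "uri", "literal" or "bnode" in the listing above
        triples.push({ subject, predicate, object: object.value, kind: object.type });
      }
    }
  }
  return triples;
}

// Trimmed-down excerpt of the RDF/JSON document.
const excerpt = {
  "http://semprog.com/people/colin": {
    "http://xmlns.com/foaf/0.1/name": [{ type: "literal", value: "Colin Evans" }],
    "http://xmlns.com/foaf/0.1/mbox": [{ type: "uri", value: "mailto:colin@semprog.com" }]
  }
};

const flattened = rdfJsonToTriples(excerpt);
console.log(flattened);
```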
RDF Serialization: Summary
● N-Triples: Verbose, "pedagogical"
● N3: Concise, in common use
● RDFa: Commonly used for embedded RDF
● RDF/XML: Some use in government & "enterprise"
● JSON-LD: Fast-rising LD standard
● RDF/JSON: Older convention for LD applications
Things we still haven't discussed...
● How to retrieve this "linked data" of which I speak
● How (and where?) to query RDF "graphs"
● How to use LD in applications
● How to create visualizations & "mashups"
Also:
● How to create and publish linked data...
Consuming Linked Data
● Querying RDF: SPARQL
● Endpoints and triple stores
● "Mashing" data in the query
● Mashing data in the application
Anatomy of a Mashup
Demo: http://logd.tw.rpi.edu/demo/building-logd-visualizations/agi-per-capita-v2.html (use Firefox)
Deep Dive: LOGD Mashup Tutorial
1. Choose data (two datasets from Data.gov)
○ "State Library Agency Survey: Fiscal Year 2006"
○ "Tax Year 2007 County Income Data"
2. Define queries to retrieve desired results from endpoint
○ http://logd.tw.rpi.edu/demo/building-logd-visualizations/mashup-353-population-1356-agi.sparql
○ Submit this URI to http://logd.tw.rpi.edu/sparql
3. Define basic HTML layout
4. Insert visualization code (e.g. Google visualization)
5. Pass static data
○ http://logd.tw.rpi.edu/demo/building-logd-visualizations/mashup-353-population-1356-agi.js
6. Revise to pass dynamic data from live SPARQL queries
http://logd.tw.rpi.edu/tutorial/building_logd_visualizations
Choose Datasets from Data.gov
Converted RDF on TWC LOGD Portal
SPARQL: pattern matching over RDF graphs
[Figure: a graph pattern in which the variables ?s and ?blackboard are joined by the predicate dbpedia2:blackboard]
http://bit.ly/RumkhW
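To show what "pattern matching over RDF graphs" means mechanically, here is a toy matcher for a single triple pattern over an in-memory array. It is only a sketch: real SPARQL engines join many patterns and use indexes. The function name matchPattern is hypothetical, and the data reuses prefixed names from the earlier Toby Segaran example rather than DBpedia.

```javascript
// A tiny in-memory "graph": an array of [subject, predicate, object] triples.
const graph = [
  ["tobes:ts", "foaf:knows", "semperp:colin"],
  ["tobes:ts", "foaf:name", "Toby Segaran"],
  ["semperp:colin", "foaf:name", "Colin Evans"]
];

// Match one triple pattern. Terms starting with "?" are variables;
// each result is an object that binds variable names to values.
function matchPattern(triples, pattern) {
  const results = [];
  for (const triple of triples) {
    const binding = {};
    let ok = true;
    pattern.forEach((term, i) => {
      if (!ok) return;
      if (term.startsWith("?")) {
        // a variable used twice must bind to the same value both times
        if (term in binding && binding[term] !== triple[i]) ok = false;
        else binding[term] = triple[i];
      } else if (term !== triple[i]) {
        ok = false; // a constant term must match exactly
      }
    });
    if (ok) results.push(binding);
  }
  return results;
}

// Roughly: SELECT ?s ?name WHERE { ?s foaf:name ?name }
const names = matchPattern(graph, ["?s", "foaf:name", "?name"]);
console.log(names);
```

The pattern in the figure, ?s dbpedia2:blackboard ?blackboard, has the same shape: a fixed predicate with variables in the subject and object positions.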
http://dbpedia.org/resource/Double,_Double,_Boy_in_Trouble
LOGD Tutorial SPARQL Query
# This query returns the AGI and population data from two Data.gov datasets
SELECT DISTINCT ?state_abbv ?agi ?population
WHERE {
  GRAPH <http://logd.tw.rpi.edu/source/data-gov/dataset/353/version/1st-anniversary> {
    ?s1 <http://logd.tw.rpi.edu/source/data-gov/dataset/353/vocab/raw/popu_st> ?population .
    ?s1 <http://logd.tw.rpi.edu/source/data-gov/dataset/353/vocab/raw/pub_fips> ?state_fipscode .
  }
  GRAPH <http://logd.tw.rpi.edu/source/data-gov/dataset/1356/version/2009-Dec-03> {
    ?s2 <http://logd.tw.rpi.edu/source/data-gov/dataset/1356/vocab/raw/state_abbrv> ?state_abbv .
    ?s2 <http://logd.tw.rpi.edu/source/data-gov/dataset/1356/vocab/raw/county_code> "000" .
    ?s2 <http://logd.tw.rpi.edu/source/data-gov/dataset/1356/vocab/raw/agi> ?agi .
    ?s2 <http://logd.tw.rpi.edu/source/data-gov/dataset/1356/vocab/raw/state_code> ?state_fipscode .
  }
}
ORDER BY ?state_fipscode
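A query like this is typically sent to an endpoint over HTTP. Under the SPARQL Protocol, a GET request carries the query text in a URL-encoded query parameter. The sketch below only builds such a URL and sends nothing; buildSparqlUrl is a hypothetical helper and the short query is a placeholder, not the tutorial query.

```javascript
// Build a SPARQL Protocol GET request URL: the query text travels in the
// URL-encoded "query" parameter.
function buildSparqlUrl(endpoint, query) {
  const params = new URLSearchParams({ query });
  return endpoint + "?" + params.toString();
}

const endpoint = "http://logd.tw.rpi.edu/sparql"; // endpoint named in the tutorial
const smallQuery = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"; // placeholder query

const requestUrl = buildSparqlUrl(endpoint, smallQuery);
console.log(requestUrl);
```

Whether a given endpoint is still online, and which result formats it offers, has to be checked against the endpoint itself.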
SPARQL Query (results)
SPARQL Query results (JSON)
google.visualization.Query.setResponse({version:0.6,status:'ok',reqId:0,table:{cols:[{id:'state_abbv',
label:'state_abbv',type:'string'},{id:'agi',label:'agi',type:'number'},{id:'population',label:'population',
type:'number'}],rows:[{c:[{v:'AL'},{v:92162773},{v:4599030}]},{c:[{v:'AK'},{v:17312636},{v:670053}]},{c:
[{v:'AZ'},{v:134442007},{v:6166318}]},{c:[{v:'AR'},{v:49783294},{v:2810872}]},{c:[{v:'CA'},{v:913619942},{v:
36457549}]},{c:[{v:'CO'},{v:128175529},{v:4753377}]},{c:[{v:'CT'},{v:122697142},{v:3504809}]},{c:[{v:'DE'},
{v:22983204},{v:853476}]},{c:[{v:'DC'},{v:18177370},{v:581530}]},{c:[{v:'FL'},{v:429785960},{v:18089888}]},
{c:[{v:'GA'},{v:199864840},{v:9363941}]},{c:[{v:'HI'},{v:30592983},{v:1285498}]},{c:[{v:'ID'},{v:30292717},
{v:1466465}]},{c:[{v:'IL'},{v:339217881},{v:12831970}]},{c:[{v:'IN'},{v:140616570},{v:6313520}]},{c:
[{v:'IA'},{v:68946837},{v:2982085}]},{c:[{v:'KS'},{v:65216515},{v:2764075}]},{c:[{v:'KY'},{v:81721206},{v:
4206074}]},{c:[{v:'LA'},{v:84029967},{v:4287768}]},{c:[{v:'ME'},{v:28954363},{v:1321574}]},{c:[{v:'MD'},{v:
168647138},{v:5615727}]},{c:[{v:'MA'},{v:202226349},{v:6437193}]},{c:[{v:'MI'},{v:227233854},{v:10095643}]},
{c:[{v:'MN'},{v:143482070},{v:5167101}]},{c:[{v:'MS'},{v:47387966},{v:2910540}]},{c:[{v:'MO'},{v:131166510},
{v:5842713}]},{c:[{v:'MT'},{v:20045504},{v:944632}]},{c:[{v:'NE'},{v:41569440},{v:1768331}]},{c:[{v:'NV'},
{v:65272642},{v:2495529}]},{c:[{v:'NH'},{v:38175000},{v:1314895}]},{c:[{v:'NJ'},{v:283024874},{v:8724560}]},
{c:[{v:'NM'},{v:38144029},{v:1954599}]},{c:[{v:'NY'},{v:513598458},{v:19306183}]},{c:[{v:'NC'},{v:
195374554},{v:8856505}]},{c:[{v:'ND'},{v:14923738},{v:635867}]},{c:[{v:'OH'},{v:259099675},{v:11478006}]},
{c:[{v:'OK'},{v:70394493},{v:3579212}]},{c:[{v:'OR'},{v:85591882},{v:3700758}]},{c:[{v:'PA'},{v:313289892},
{v:12440621}]},{c:[{v:'RI'},{v:26532233},{v:1067610}]},{c:[{v:'SC'},{v:88615194},{v:4321249}]},{c:[{v:'SD'},
{v:17825580},{v:781919}]},{c:[{v:'TN'},{v:126270760},{v:6038803}]},{c:[{v:'TX'},{v:504386602},{v:
23507783}]},{c:[{v:'UT'},{v:55426179},{v:2550063}]},{c:[{v:'VT'},{v:15246152},{v:623908}]},{c:[{v:'VA'},{v:
217677476},{v:7642884}]},{c:[{v:'WA'},{v:175730868},{v:6395798}]},{c:[{v:'WV'},{v:32243697},{v:1818470}]},
{c:[{v:'WI'},{v:140516394},{v:5556506}]},{c:[{v:'WY'},{v:15216840},{v:515004}]}]}})
http://logd.tw.rpi.edu/demo/building-logd-visualizations/mashup-353-population-1356-agi.js
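The mashup's "AGI per capita" value is derived in the application by dividing the two numeric columns of each result row. A minimal sketch, using three rows copied from the results above; the helper name agiPerCapita is hypothetical, and no unit conversion is attempted because the slides do not state the units of the agi column.

```javascript
// A few rows copied from the result table above: [state, agi, population].
const rows = [
  ["AL", 92162773, 4599030],
  ["AK", 17312636, 670053],
  ["WY", 15216840, 515004]
];

// AGI per capita is simply agi / population for each state.
// Units follow whatever the source dataset uses; they are not converted here.
function agiPerCapita(resultRows) {
  return resultRows.map(([state, agi, population]) => ({
    state,
    perCapita: agi / population
  }));
}

const perCapitaRows = agiPerCapita(rows);
console.log(perCapitaRows);
```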
Defining HTML Layout
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>AGI per Capita Map</title>
  </head>
  <body>
    <div>AGI per Capita Map: average adjusted gross income per person in dollar amount in US states.</div>
    <div id='map_canvas'>Loading Map ...</div>
  </body>
</html>
http://logd.tw.rpi.edu/tutorial/building_logd_visualizations
Visualization Code...
1. Load the appropriate Google Visualization API packages (in this case, the GeoMap package).
2. Define a callback function for loading visualization code, which is called upon the loading of the HTML page.
3. Obtain data from a given source to pass to our GeoMap instance. The Google Visualization API is designed to accept data in the form of specially-formatted JSON (represented by a URI) which can then be fed to a JSON processing function.
4. Following a call to the JSON processor, verify that it successfully processed the passed file.
5. Get back a response from the query processor, containing the data from the JSON file.
6. Define a data table to store the response data in. This process starts by defining header entries of the form TABLE.addColumn(DATATYPE, NAME).
7. For each entry in the response, create a new data table row for the corresponding data.
8. Define a configuration for the GeoMap instance to be visualized, containing information such as resolution.
9. Define the GeoMap instance in the HTML div with id='map_canvas', using the configuration from Step 8 and data table from Step 7.
http://logd.tw.rpi.edu/tutorial/building_logd_visualizations
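The nine steps can be sketched roughly as follows. This is not the tutorial's code: the function names, the "US-" region-code prefix and the width and height values are assumptions for illustration, and the google namespace is passed in as a parameter so the table-building part can be exercised outside a browser. The calls on Query, the response, DataTable and GeoMap follow the legacy Google Visualization API as I understand it and should be checked against its documentation.

```javascript
// Steps 6-7: build a data table of per-capita values from the query response.
// `viz` stands in for the google.visualization namespace.
function buildPerCapitaTable(viz, source) {
  const table = new viz.DataTable();
  table.addColumn("string", "State");
  table.addColumn("number", "AGI per capita");
  for (let row = 0; row < source.getNumberOfRows(); row++) {
    const state = source.getValue(row, 0);
    const agi = source.getValue(row, 1);
    const population = source.getValue(row, 2);
    table.addRow(["US-" + state, agi / population]); // assumed region code form
  }
  return table;
}

// Steps 3-5 and 8-9: query the data source, check for errors, draw the map.
function drawMap(google, dataUrl, container) {
  const query = new google.visualization.Query(dataUrl);
  query.send(function (response) {
    if (response.isError()) {
      container.textContent = "Query failed: " + response.getMessage();
      return;
    }
    const table = buildPerCapitaTable(google.visualization, response.getDataTable());
    // Illustrative configuration values, not taken from the tutorial.
    const options = { region: "US", dataMode: "regions", width: 556, height: 347 };
    new google.visualization.GeoMap(container).draw(table, options);
  });
}

// Steps 1-2, in the page itself (browser only, legacy loader):
//   google.load("visualization", "1", { packages: ["geomap"] });
//   google.setOnLoadCallback(function () {
//     drawMap(google, dataUrl, document.getElementById("map_canvas"));
//   });
```

Keeping the table construction separate from the drawing code means the data handling can be checked with simple stand-in objects before the page is ever loaded in a browser.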
Dynamic Visualizations
Loading data using SPARQL queries
// Load data using a SPARQL query, routed through the LOGD SPARQL proxy
var sparqlproxy = "http://logd.tw.rpi.edu/ws/sparqlproxy.php";
// Location of the stored SPARQL query file
var queryloc = "http://logd.tw.rpi.edu/demo/building-logd-visualizations/mashup-353-population-1356-agi.sparql";
// SPARQL endpoint that the proxy sends the query to
var service = "http://logd.tw.rpi.edu/sparql";
// The proxy takes the output format, the endpoint and the query location as URL parameters
var queryurl = sparqlproxy
  + "?" + "output=gvds"
  + "&service-uri=" + encodeURIComponent(service)
  + "&query-uri=" + encodeURIComponent(queryloc);
Next: Building the Web of Data
● Converting datasets to RDF
● Hosting: Triplestores & endpoints
● Enterprise use cases
● Advanced techniques
● Web Science...
Part II: Building the Web of Data
1. Review: the Web of Data
2. Publishing the Web of Data
3. Engineering the Web of Data in the Enterprise
4. Enterprise Applications of Semantic Technologies
5. Advanced "Semantic Web" concepts
6. Web Science: Observing and (re)Engineering the Web
Review: Web Architecture
First Principles of the Web...
● A standard system for identifying resources
● Standard formats for representing resources
● A standard protocol for exchanging resources
Relevant core standards:
● URIs (URLs): Uniform Resource Identifiers
● HTML: Hypertext Markup Language
● HTTP: Hypertext Transfer Protocol
Review: Linked Data Principles
Tim Berners-Lee http://www.w3.org/DesignIssues/LinkedData.html
● Use URIs as names for things
● Use HTTP URIs so people can look up those names
● When someone looks up a URI, return useful information
○ use standard representation formats to express it
● Include links to other URIs, so consumers can discover more things
○ By "consumers" we mean people or applications
Now: Publishing the Web of Data
Recall our triples from the previous lecture...
To be useful, this data must be loaded in a triple store and published via a web-accessible SPARQL endpoint
Industrial-strength Triple stores
1. AllegroGraph (1+Trillion)
2. OpenLink Virtuoso v6.1 - 15.4B+ explicit; uncounted virtual/inferred
3. BigOWLIM (12B explicit, 20B total); 100,000 queries per $1
4. Garlik 4store (15B)
5. Bigdata(R) (12.7B)
6. YARS2 (7B)
7. Jena TDB (1.7B)
8. Jena SDB (650M)
9. Mulgara (500M)
10. RDF gateway (262M)
11. Jena with PostgreSQL (200M)
12. Kowari (160M)
13. 3store with MySQL 3 (100M)
14. Sesame (70M)
15. Others who claim to go big
TWC uses Virtuoso Open Source edition
You can install Apache Jena yourselves!
Publishing: RDBMS to RDF
● Advantage: Leveraging "legacy" sources
● Challenge: Complexity...
● Example: D2RQ Platform
○ D2RQ Mapping Language
○ D2RQ Engine
○ D2R Server
See also: http://d2rq.org/
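The core idea behind an RDBMS-to-RDF mapping can be sketched as follows: the primary key becomes a subject URI and each column becomes a predicate. This is not the D2RQ Mapping Language or its API; the mapping object, the rowToTriples helper and the example.org base URI are all hypothetical.

```javascript
// Hypothetical mapping for one table: key column -> subject URI, columns -> predicates.
var mapping = {
  baseUri: "http://example.org/employee/",
  keyColumn: "id",
  classUri: "foaf:Person",
  columns: { name: "foaf:name", email: "foaf:mbox" }
};

// Turn one relational row into a list of [subject, predicate, object] triples.
function rowToTriples(row, map) {
  var subject = map.baseUri + encodeURIComponent(row[map.keyColumn]);
  var triples = [[subject, "rdf:type", map.classUri]];
  Object.keys(map.columns).forEach(function (col) {
    // Skip NULL columns rather than emitting empty literals
    if (row[col] !== null && row[col] !== undefined) {
      triples.push([subject, map.columns[col], String(row[col])]);
    }
  });
  return triples;
}

var triples = rowToTriples({ id: 7, name: "Alice", email: null }, mapping);
console.log(triples);
```

The complexity mentioned above shows up quickly in practice: foreign keys, join tables and datatypes all need mapping rules of their own.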
Publishing: Linked Data API
Motivations:
● SPARQL and RDF have steep learning curves
● RDF support in common web development tool stacks is scarce
● The Linked Data API is one solution
Advantages:
● Easy-to-use web API over linked data
● Allows publishers to provide URIs for lists of things
● Allows users to get the data back as JSON, XML, or RDF
● Easy to filter data using simple URL query parameters
Makes it easy to create web applications over the published data using standard tools
https://code.google.com/p/linked-data-api/wiki/Specification
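A rough sketch of what "simple URL query parameters" could look like from the client side. The endpoint http://example.org/api/schools, the district filter and the buildListUrl helper are made up; the _pageSize parameter and the .json suffix follow my reading of the specification and should be treated as assumptions to verify there.

```javascript
// Build a list URL: <base>.<format>?<filters>. Keys are sorted for a stable result.
function buildListUrl(base, format, filters) {
  var query = Object.keys(filters).sort().map(function (key) {
    return encodeURIComponent(key) + "=" + encodeURIComponent(filters[key]);
  }).join("&");
  return base + "." + format + (query ? "?" + query : "");
}

var url = buildListUrl(
  "http://example.org/api/schools",   // hypothetical list endpoint
  "json",                             // ask for JSON instead of RDF
  { district: "Troy", _pageSize: 10 } // filter plus paging (assumed parameter names)
);
console.log(url);
```

The point is that a web developer needs only URL construction and a JSON parser, not a SPARQL client.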
Linked Data API Example
UK Bathing Water Data Explorer
Live: http://environment.data.gov.uk/lab/bwq-web
Details: http://www.epimorphics.com/web/projects/bathing-water-quality
Architecture of Linked Data Applications
● The Crawling Pattern
● The On-The-Fly Dereferencing Pattern
● The Query Federation Pattern
The architecture of a Linked Data application
depends on its driving use case
The Crawling Pattern
● Applications "crawl" the Web of Data in advance by traversing RDF links
● Integrate and cleanse discovered data
● Provide higher layers of the application with an integrated view of the original data
● Mimics the architecture of classical Web search engines like Google and Yahoo
● Suitable for implementing applications on top of an open, growing set of sources
○ new data sources are discovered by the crawler at run-time.
● Separates the tasks of building up the cache and using the cache later
○ enables applications to execute complex queries with reasonable performance over large amounts of data
Disadvantages:
● Data is replicated
● Applications may work with stale data; crawler only re-crawls sources at certain intervals
The crawling pattern is implemented by Linked Data search engines
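A minimal sketch of the crawling pattern, assuming a mock "Web" held in memory. A real crawler would issue HTTP requests, respect politeness rules and cleanse the data; the ex: URIs and the crawl function are invented for illustration.

```javascript
// Mock Web of Data: each URI dereferences to a list of [s, p, o] triples.
var web = {
  "ex:a": [["ex:a", "foaf:knows", "ex:b"]],
  "ex:b": [["ex:b", "foaf:knows", "ex:c"], ["ex:b", "foaf:name", "Bob"]],
  "ex:c": [["ex:c", "foaf:name", "Carol"]]
};

// Breadth-first crawl: follow RDF links in advance and build a local cache.
function crawl(seed, dereference) {
  var cache = [];
  var seen = {};
  var queue = [seed];
  while (queue.length > 0) {
    var uri = queue.shift();
    if (seen[uri]) { continue; }
    seen[uri] = true;
    (dereference(uri) || []).forEach(function (t) {
      cache.push(t);
      // Objects that look like URIs are links to further sources
      if (/^ex:/.test(t[2])) { queue.push(t[2]); }
    });
  }
  return cache;
}

var cache = crawl("ex:a", function (uri) { return web[uri]; });

// Later queries run against the cache only, with no network access.
var names = cache
  .filter(function (t) { return t[1] === "foaf:name"; })
  .map(function (t) { return t[2]; });
console.log(names);
```

Note that ex:b and ex:c were never listed up front; the crawler discovered them at run time, which is what makes the pattern suit an open set of sources.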
"Crawling Pattern" in the Wild
Google Rich Snippets
The On-The-Fly Dereferencing Pattern
● URIs are dereferenced and links are followed the moment the application requires the data
● Applications never process stale data
Disadvantages:
● More complex operations are very slow as they might involve dereferencing thousands of URIs
in the background
● Architectures have been proposed for answering complex queries over the Web of Data by
relying on the on-the-fly dereferencing pattern
● Results show that data currency and a very high degree of completeness are achieved at the
price of very slow query execution
The on-the-fly dereferencing pattern is implemented by Linked Data browsers
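The cost of on-the-fly dereferencing can be illustrated by counting lookups. In this sketch the "Web" is again a mock object and every name (web, dereference, namesOfFriends, the ex: URIs) is an assumption made for the example.

```javascript
// Mock Web of Data; a real application would perform one HTTP GET per lookup.
var web = {
  "ex:a": [["ex:a", "foaf:knows", "ex:b"], ["ex:a", "foaf:knows", "ex:c"]],
  "ex:b": [["ex:b", "foaf:name", "Bob"]],
  "ex:c": [["ex:c", "foaf:name", "Carol"]]
};

var lookups = 0;
function dereference(uri) {
  lookups += 1; // count every dereference to expose the cost
  return web[uri] || [];
}

// Answer "names of the people this person knows" by following links at request time.
function namesOfFriends(person) {
  var names = [];
  dereference(person).forEach(function (t) {
    if (t[1] !== "foaf:knows") { return; }
    dereference(t[2]).forEach(function (u) {
      if (u[1] === "foaf:name") { names.push(u[2]); }
    });
  });
  return names;
}

var friends = namesOfFriends("ex:a");
console.log(friends, lookups);
```

The data is always current, but one small question already cost three lookups; a query touching thousands of resources costs thousands.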
"On-the-fly" examples
● Our previous example (dynamic version)
● Tabulator, Marbles
The Query Federation Pattern
● Relies on sending complex queries (or parts) directly to a fixed set of data sources.
● Useful if data sources provide SPARQL endpoints in addition to serving their data on the Web
via dereferenceable URIs
● Enables applications to work with current data without needing to replicate complete data
sources locally
Disadvantages:
● Finding performant query execution plans for join queries over larger numbers of data sources
is complex (i.e. a research topic)
● Query performance slows down significantly when the number of data sources grows
● The query federation pattern should only be used in situations where the number of data sources
is known to be small
Applications could follow links between data sources, examine voiD descriptions provided by these
data sources and then include data sources which provide SPARQL endpoints into their list of targets
for federated queries
http://linkeddatabook.com/editions/1.0/
Query Federation Example
# Prefix declarations added so the query is self-contained (the slide omitted them);
# the namespace URIs are the commonly published ones and should be verified before running.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dbpo: <http://dbpedia.org/ontology/>
PREFIX imdb: <http://data.linkedmdb.org/resource/movie/>

SELECT ?birthDate ?spouseName ?movieTitle ?movieDate {
  { SERVICE <http://dbpedia.org/sparql>
    { SELECT ?birthDate ?spouseName WHERE {
        ?actor rdfs:label "Arnold Schwarzenegger"@en ;
               dbpo:birthDate ?birthDate ;
               dbpo:spouse ?spouseURI .
        ?spouseURI rdfs:label ?spouseName .
        FILTER ( lang(?spouseName) = "en" )
    } } }
  { SERVICE <http://data.linkedmdb.org/sparql>
    { SELECT ?actor ?movieTitle ?movieDate WHERE {
        ?actor imdb:actor_name "Arnold Schwarzenegger" .
        ?movie imdb:actor ?actor ;
               dcterms:title ?movieTitle ;
               dcterms:date ?movieDate .
    } } } }
[Figure: Application Code → Federated SPARQL Service (e.g. Jena ARQ) → DBPedia.org, LinkedMDB.org]
See also bobdc.org http://bit.ly/HLdQ4S
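What a federated query service does with the partial results can be sketched as a join over variable bindings. This is a simplified illustration, not Jena ARQ's implementation: the two "endpoints" are mock functions returning invented bindings, and join and compatible are hypothetical helper names.

```javascript
// Two solutions are compatible when every variable they share has the same value.
function compatible(a, b) {
  return Object.keys(a).every(function (v) {
    return !(v in b) || a[v] === b[v];
  });
}

// Combine every compatible pair of solutions from the two sources.
function join(left, right) {
  var out = [];
  left.forEach(function (a) {
    right.forEach(function (b) {
      if (compatible(a, b)) { out.push(Object.assign({}, a, b)); }
    });
  });
  return out;
}

// Mock endpoints standing in for two remote SPARQL services (invented data).
var endpoints = {
  people: function () {
    return [{ person: "ex:p1", birthDate: "1970-01-01" }];
  },
  movies: function () {
    return [
      { person: "ex:p1", movieTitle: "Movie A" },
      { person: "ex:p2", movieTitle: "Movie B" }
    ];
  }
};

var results = join(endpoints.people(), endpoints.movies());
console.log(results);
```

When the two sub-queries share no variable, every pair is compatible and the join degenerates into a cross product, which is one reason federated query planning gets expensive as sources are added.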
Enterprise Use Cases
http://www.w3.org/2001/sw/sweo/public/UseCases/
"Enterprise Energy Intelligence" (DERI)
"A Semantic Web Content Repository for Clinical Research"
(Cleveland Clinic)
Cleveland Clinic Use Case
● Improve the Clinic’s ability to use patient data for generating new
knowledge to improve future patient care through outcomes-based and
longitudinal clinical research.
● Leverage the expressiveness and versatility of Semantic Web formats to provide
individual patients with appropriate terminology and an accessible view of
summary data.
● Over 4 years, Cleveland Clinic has developed a representational
methodology for bridging data collection, document management, and
knowledge representation.
● The result is a unified content repository called SemanticDB.
http://www.w3.org/2001/sw/sweo/public/UseCases/ClevelandClinic/
Cleveland Clinic Use Case
● SemanticDB internally deployed for production on top of an open source
XML & RDF content repository and Firefox (with extensions).
● Methodology realized through a core set of terms that facilitate creation of
a domain vocabulary (or domain model)
○ instances of the vocabulary managed automatically by the system.
● Patient records available as both uniform, structured markup and RDF.
● Coordinated use of both representation languages enables a variety of
operations on the patient record:
○ form-based data entry, transformation to reporting formats, document
validation, targeted inference, and querying
○ Operations can be dispatched on the patient record documents and
RDF graphs over a uniform set of interfaces.
http://www.w3.org/2001/sw/sweo/public/UseCases/ClevelandClinic/
DERI "Enterprise Energy Intelligence"
http://dgsit.deri.ie/?q=node/15 and http://slidesha.re/195Gnrr
More about "Enterprise Linked Data"
Part I: Why Link Enterprise Data?
● Semantic Web and the Linked Data Enterprise
● The Role of Community-Driven Data Curation for Enterprises
Part II: Approval and Support of Linked Data Projects
● Preparing for a Linked Data Enterprise
● Selling and Building Linked Data: Drive Value and Gain Momentum
Part III: Techniques for Linking Enterprise Data
● Enhancing Enterprise 2.0 Ecosystems Using Semantic Web and Linked Data
Technologies
● Linking XBRL Financial Data
● Scalable Reasoning Techniques for Semantic Enterprise Data
● Reliable and Persistent Identification of Linked Data Elements
Part IV: Success Stories
● Linked Data for Fighting Global Hunger
● Enterprise Linked Data as Core Business Infrastructure
● Standardizing Legal Content with OWL and RDF
● A Role for Semantic Web Technologies in Patient Record Data Collection
● Use of Semantic Web technologies on the BBC Web Sites
http://3roundstones.com/led_book/led-contents.html
Advanced Concepts
● Vocabulary design/RDFS
● Knowledge Organization
● Ontology design
● Provenance
● Inference
For a poetic (and humorous!) consideration of the
evolution of the "Semantic Layer Cake" see:
Jim Hendler, "My Take on the Semantic Web Layer
Cake." http://bit.ly/195L70i
http://bit.ly/195LrMz
Inference: Discovering New Relationships
On the Semantic Web, data is modeled as a set of (named) relationships between resources
● Inference means using automatic procedures to generate new relationships
○ based on the data...
○ ...and some additional information in the form of a vocabulary or a set of rules
● The new relationships may be explicitly added to the set of data, or may be returned at query
time (implementation issue)
● The source of additional information is defined through vocabularies or rule sets
● Both approaches draw upon knowledge representation techniques
○ Ontologies provide classification methods, putting an emphasis on defining 'classes',
'subclasses', on how individual resources can be associated to such classes, and
characterizing the relationships among classes and their instances
○ Rules define mechanisms for discovering and generating new relationships based on
existing ones, much like logic programs (Prolog)
● In the Semantic Web toolkit, RDFS, OWL, or SKOS are used for defining ontologies
○ RIF covers rule based approaches
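A minimal sketch of inference as an automatic procedure: one rule is applied repeatedly until no new relationships appear. The rule shown is subclass propagation; the infer function is hypothetical, and the foaf:Agent to ex:Thing statement is invented to give the rule a second step.

```javascript
// Rule: if X rdf:type C and C rdfs:subClassOf D, then X rdf:type D.
// Apply it until a fixpoint is reached, then return only the new triples.
function infer(triples) {
  var known = {};
  var all = triples.slice();
  all.forEach(function (t) { known[t.join(" ")] = true; });
  var changed = true;
  while (changed) {
    changed = false;
    all.slice().forEach(function (a) {
      if (a[1] !== "rdf:type") { return; }
      all.slice().forEach(function (b) {
        if (b[1] === "rdfs:subClassOf" && b[0] === a[2]) {
          var t = [a[0], "rdf:type", b[2]];
          if (!known[t.join(" ")]) {
            known[t.join(" ")] = true;
            all.push(t);
            changed = true;
          }
        }
      });
    });
  }
  return all.slice(triples.length);
}

var inferred = infer([
  ["ex:alice", "rdf:type", "foaf:Person"],
  ["foaf:Person", "rdfs:subClassOf", "foaf:Agent"],
  ["foaf:Agent", "rdfs:subClassOf", "ex:Thing"] // invented for the example
]);
console.log(inferred);
```

Whether the inferred triples are stored alongside the data or computed at query time is the implementation choice noted above.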
Vocabulary Design: W3C RDFS (1)
● RDF Vocabulary Description Language
● RDF has no mechanism for:
○ describing properties
○ describing the relationships between properties and other resources
● RDF Schema defines classes and properties for describing classes,
properties and other resources
● RDF Schema vocabulary descriptions are written in RDF
http://www.w3.org/TR/2004/REC-rdf-schema-20040210/
RDF Schema: Classes
● rdfs:Resource ...is the class of everything
● rdfs:Class ...declares a resource as a class for other resources
● rdfs:Literal ...literal values such as strings and integers
● rdfs:Datatype ...the class of datatypes
● rdf:XMLLiteral ...the class of XML literal values
● rdf:Property ...the class of properties
http://www.w3.org/TR/2004/REC-rdf-schema-20040210/
RDF Schema: Properties
● rdfs:domain ...declares the class of the subject in a triple whose second component is the predicate.
● rdfs:range ...declares the class or datatype of the object in a triple whose second part is the predicate
○ ex:employer rdfs:domain foaf:Person
○ ex:employer rdfs:range foaf:Organization
● rdf:type ...states that a resource is an instance of a class
● rdfs:subClassOf ...declares hierarchies of classes.
○ e.g. "Every Person is an Agent": foaf:Person rdfs:subClassOf foaf:Agent
● rdfs:subPropertyOf ...states that all resources related by one property are also related by another
● rdfs:label ...used to provide a human-readable version of a resource's name
● rdfs:comment ...provides a human-readable description of a resource
● rdfs:seeAlso ...indicates a resource that might provide additional information about the subject resource.
● rdfs:isDefinedBy ...indicates a resource defining the subject resource
http://www.w3.org/TR/2004/REC-rdf-schema-20040210/
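The ex:employer example can be turned into a small sketch of what rdfs:domain and rdfs:range let a reasoner conclude. The inferTypes helper and the ex:alice / ex:rpi data triple are assumptions added for illustration.

```javascript
// Schema statements taken from the ex:employer example
var schema = [
  ["ex:employer", "rdfs:domain", "foaf:Person"],
  ["ex:employer", "rdfs:range", "foaf:Organization"]
];

// One invented data triple that uses the property
var data = [["ex:alice", "ex:employer", "ex:rpi"]];

// For each use of a property, type the subject by its domain and the object by its range.
function inferTypes(schemaTriples, dataTriples) {
  var inferred = [];
  dataTriples.forEach(function (t) {
    schemaTriples.forEach(function (s) {
      if (s[0] !== t[1]) { return; }
      if (s[1] === "rdfs:domain") { inferred.push([t[0], "rdf:type", s[2]]); }
      if (s[1] === "rdfs:range") { inferred.push([t[2], "rdf:type", s[2]]); }
    });
  });
  return inferred;
}

var types = inferTypes(schema, data);
console.log(types);
```

Note that domain and range are not validation constraints here: the statements do not reject data, they add rdf:type conclusions to it.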
Knowledge Organization 1: W3C OWL
● Web Ontology Language
● RDFS too weak to describe resources in sufficient detail
○ No localised range and domain constraints
■ Can’t say that the range of hasChild is person when applied to persons and elephant
when applied to elephants
○ No existence/cardinality constraints
■ Can’t say that all instances of person have a mother that is also a person, or that
persons have exactly 2 parents
○ No transitive, inverse or symmetrical properties
■ Can’t say that isPartOf is a transitive property, that hasPart is the inverse of isPartOf
or that touches is symmetrical
● Difficult to provide reasoning support
○ No “native” reasoners for non-standard semantics
http://www.w3.org/2007/OWL/wiki/OWL_Working_Group or http://bit.ly/195WANj
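The transitive, inverse and symmetric property characteristics named above can be sketched with a small closure computation. This is not an OWL reasoner; the axioms object, the closure and holds helpers, and the wheel/car data are all invented for illustration.

```javascript
// Property characteristics for the isPartOf / hasPart / touches example
var axioms = {
  transitive: ["ex:isPartOf"],
  symmetric: ["ex:touches"],
  inverse: { "ex:hasPart": "ex:isPartOf", "ex:isPartOf": "ex:hasPart" }
};

// Apply the three kinds of axiom until no new triples can be added.
function closure(triples, ax) {
  var known = {};
  var all = [];
  function add(t) {
    var key = t.join(" ");
    if (known[key]) { return false; }
    known[key] = true;
    all.push(t);
    return true;
  }
  triples.forEach(add);
  var changed = true;
  while (changed) {
    changed = false;
    all.slice().forEach(function (a) {
      if (ax.symmetric.indexOf(a[1]) >= 0 && add([a[2], a[1], a[0]])) { changed = true; }
      if (ax.inverse[a[1]] && add([a[2], ax.inverse[a[1]], a[0]])) { changed = true; }
      if (ax.transitive.indexOf(a[1]) >= 0) {
        all.slice().forEach(function (b) {
          if (b[1] === a[1] && b[0] === a[2] && add([a[0], a[1], b[2]])) { changed = true; }
        });
      }
    });
  }
  return all;
}

function holds(all, s, p, o) {
  return all.some(function (t) { return t[0] === s && t[1] === p && t[2] === o; });
}

var all = closure([
  ["ex:wheel", "ex:isPartOf", "ex:car"],
  ["ex:spoke", "ex:isPartOf", "ex:wheel"],
  ["ex:wheel", "ex:touches", "ex:road"]
], axioms);
console.log(holds(all, "ex:car", "ex:hasPart", "ex:spoke"));
```

None of these conclusions can be expressed with RDFS alone, which is the gap OWL was designed to fill.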
Knowledge Organization 1: W3C OWL
Desirable features identified for Web Ontology Language:
● Extends existing Web standards
○ Such as XML, RDF, RDFS
● Easy to understand and use
○ Should be based on familiar KR* idioms
● Formally specified
● Of “adequate” expressive power
● Possible to provide automated reasoning support
KR* = knowledge representation
http://www.w3.org/2007/OWL/wiki/OWL_Working_Group or http://bit.ly/195WANj or http://bit.ly/1960964
OWL Tools: Protege-OWL Editor
http://protege.stanford.edu/overview/protege-owl.html
Knowledge Organization 2: W3C SKOS
● Simple Knowledge Organization System
● An application of RDFS and OWL
● Provides a way to represent controlled vocabularies, taxonomies and
thesauri
○ controlled vocabulary: a list of terms which a community or
organization has agreed upon
○ taxonomy: a controlled vocabulary organized in a hierarchy
○ thesaurus: a taxonomy with more information about each concept
including preferred and alternative terms.
○ A thesaurus may also contain relationships to related concepts
● SKOS is an OWL ontology; it can be written out in any RDF syntax
http://www.w3.org/2004/02/skos/ or http://slidesha.re/1etWDue or http://bit.ly/1etYLlE
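A thesaurus in the SKOS sense can be sketched as concepts with one preferred label, optional alternative labels and a link to a broader concept. The object layout, the helper names and the animal concepts below are invented; real SKOS data would be RDF using skos:prefLabel, skos:altLabel and skos:broader.

```javascript
// Invented mini-thesaurus keyed by concept URI
var concepts = {
  "ex:animals": { prefLabel: "animals", altLabels: ["fauna"], broader: null },
  "ex:mammals": { prefLabel: "mammals", altLabels: [], broader: "ex:animals" },
  "ex:cats": { prefLabel: "cats", altLabels: ["domestic cats"], broader: "ex:mammals" }
};

// Resolve any label (preferred or alternative) to its concept URI
function findConcept(label) {
  var found = null;
  Object.keys(concepts).forEach(function (uri) {
    var c = concepts[uri];
    if (c.prefLabel === label || c.altLabels.indexOf(label) >= 0) { found = uri; }
  });
  return found;
}

// Walk the broader links up the hierarchy (the taxonomy part of the thesaurus)
function ancestors(uri) {
  var path = [];
  var current = concepts[uri].broader;
  while (current) {
    path.push(current);
    current = concepts[current].broader;
  }
  return path;
}

var concept = findConcept("domestic cats");
var broaderChain = ancestors(concept);
console.log(concept, broaderChain);
```

The alternative labels are what distinguish a thesaurus from a plain taxonomy: users can search with their own terms and still land on the agreed concept.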
Provenance: The W3C PROV Model
● A set of W3C recommendations and notes on modelling provenance
● PROV-O is the "core..."
http://www.w3.org/TR/prov-primer/
Provenance in a Nutshell
● prov:Entity is a physical, digital, conceptual, or other kind
of thing with some fixed aspects; entities may be real or
imaginary
● prov:Activity is something that occurs over a period of
time and acts upon or with entities; it may include
consuming, processing, transforming, modifying,
relocating, using, or generating entities
● prov:Agent is something that bears some form of
responsibility for an activity taking place, for the
existence of an entity, or for another agent's
activity
These three classes provide a basis for the rest of PROV-O
http://www.w3.org/TR/prov-primer/ or http://www.provbook.org/
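The three classes can be seen working together in a small sketch. The prov:used, prov:wasGeneratedBy and prov:wasAssociatedWith properties come from PROV-O; the ex: resources and the helper functions are invented for illustration.

```javascript
// Invented provenance record: a chart was generated by a plotting activity
// that used a dataset and was carried out by Alice.
var statements = [
  ["ex:chart", "rdf:type", "prov:Entity"],
  ["ex:dataset", "rdf:type", "prov:Entity"],
  ["ex:plotting", "rdf:type", "prov:Activity"],
  ["ex:alice", "rdf:type", "prov:Agent"],
  ["ex:plotting", "prov:used", "ex:dataset"],
  ["ex:chart", "prov:wasGeneratedBy", "ex:plotting"],
  ["ex:plotting", "prov:wasAssociatedWith", "ex:alice"]
];

// All objects of the given subject and predicate
function objectsOf(subject, predicate) {
  return statements
    .filter(function (t) { return t[0] === subject && t[1] === predicate; })
    .map(function (t) { return t[2]; });
}

// Who bears responsibility for an entity: follow wasGeneratedBy, then wasAssociatedWith.
function responsibleAgents(entity) {
  var agents = [];
  objectsOf(entity, "prov:wasGeneratedBy").forEach(function (activity) {
    objectsOf(activity, "prov:wasAssociatedWith").forEach(function (agent) {
      agents.push(agent);
    });
  });
  return agents;
}

var agents = responsibleAgents("ex:chart");
var inputs = objectsOf("ex:plotting", "prov:used");
console.log(agents, inputs);
```

Questions such as "who made this?" and "what was it derived from?" become graph traversals over the Entity, Activity and Agent statements.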
Inference and W3C RIF
● Production Rules
○ Analogous to an instruction in a program: If a certain condition holds, then some action is
carried out
○ Example: "If a customer has flown more than 100,000 miles, then upgrade him to Gold
Member status."
● Declarative Rules
○ Stating a fact about the world
○ Understood as sentences of the form "If P, then Q"
○ Example: "If a person is currently president of the United
States of America, then his or her current residence is the
White House."
● There are many rule systems, esp. in the expert systems domain
● The W3C Rule Interchange Format is an interchange format
between existing rule systems
http://www.w3.org/TR/2013/NOTE-rif-primer-20130205/
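The difference between the two rule styles can be sketched in ordinary code. This is not RIF syntax; the function names, the customer record and the ex: terms are assumptions chosen to mirror the two examples above.

```javascript
// Production rule: if the condition holds, carry out an action (here, a state change).
function applyGoldRule(customer) {
  if (customer.milesFlown > 100000) {
    customer.status = "Gold";
  }
  return customer;
}

// Declarative rule: "If P, then Q". It derives new facts and changes nothing.
function residenceRule(facts) {
  return facts
    .filter(function (f) { return f[1] === "ex:isPresidentOf" && f[2] === "ex:USA"; })
    .map(function (f) { return [f[0], "ex:currentResidence", "ex:WhiteHouse"]; });
}

var customer = applyGoldRule({ name: "Pat", milesFlown: 120000, status: "Basic" });
var derived = residenceRule([
  ["ex:somePerson", "ex:isPresidentOf", "ex:USA"],
  ["ex:someoneElse", "ex:isPresidentOf", "ex:Elsewhere"]
]);
console.log(customer.status, derived);
```

An interchange format matters because each rule engine would write these two rules in its own syntax, even though the logic is the same.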
The Future....
The Future (from the past)
What is Web Science?
● Positions the World Wide Web as an object of scientific
study unto itself
● Recognizes the Web as a transformational, disruptive
technology
● Its practitioners focus on understanding the Web...
○ ...its components, facets and characteristics
● The Web Science Method: “the process of designing
things in a very large space..."
What does Web Science ask?
● What processes have driven the Web’s growth?
○ Will they persist?
● How does large-scale structure emerge from a simple set of
protocols?
● How does the Web function as a socio-technical system?
● What drives the viral uptake of certain Web phenomena?
Bottom line: What might fragment the Web?
Clare Hooper, et.al. http://bit.ly/R813sC
To Probe Further...
● TWC Linking Open Government Data Portal http://logd.tw.rpi.edu
○ Esp: Linking Open Government Data Tutorials: http://logd.tw.rpi.edu/tutorials
● Heath & Bizer, "Linked Data." http://bit.ly/1dxKxNe
● Cambridge Semantics, Semantic University http://bit.ly/1cvy9Mv
● David Wood, "Intro to Linked Data: Modelling" http://slidesha.re/HUmihT
● David Wood, "Intro to Linked Data: Context" http://slidesha.re/1fGhQPv
● David Wood, "Intro to Linked Data: SPARQL" http://slidesha.re/1eUd8Qz
● Rob Stiles, "Linked Data, RDF and SPARQL" http://slidesha.re/17xVIqq
● Ivan Herman, "An Introduction to Semantic Web and Linked Data" http://slidesha.re/1aHREyv
● "Linked Data for the Enterprise" http://slidesha.re/1cvyqyS
● "Semantic Enterprise 2.0" http://slidesha.re/19vkl5u
● "Smart Enterprises" http://slidesha.re/1aXlncX
● "Linked data management" http://slidesha.re/19pHmTw GOOD!
● "Enterprise Data Meets Web Data" http://slidesha.re/1ifx9AU
● DERI, "Enterprise Energy Management using a Linked Dataspace for Energy Intelligence" http://slidesha.re/195Gnrr
● "Enhancement and Integration of Corporate Social Software Using the Semantic Web" http://bit.ly/1fGi7BW
● "Enabling Semantic Web technologies in the Enterprise 2.0 environment" http://slidesha.re/19vkl5u
● Workshop on Enterprise Semantic Web http://www.wasabi-ws.org/ esp: http://bit.ly/18zMhlp
● Best Buy examples (Jay Myers) http://www.slideshare.net/jaymmyers
● "Querying Semantic Web Databases" http://bit.ly/17vyXy9
● From "Big Data" to "Smart Data" (e.g. Ontotext example) http://slidesha.re/16s1iI7
● "How to publish linked data on the web" http://bit.ly/1cvzcfe
○ Superseded by: "Linked Data: Evolving the Web into a Global Data Space" http://linkeddatabook.com/editions/1.0/
● "Practical Cross-dataset Queries on the Web of Data" http://slidesha.re/1ifxsvy

Engineering a Semantic Web: ITWS Capstone Lecture (Spring 2014)

  • 1. Engineering a Semantic Web ITWS Capstone (Spring 2014) John S. Erickson, Ph.D ❇ Tetherless World Constellation ❇ RPI
  • 3. Objectives ● Deeper understanding of Web architecture ● Understand the Semantic Web stack ● Linked Data principles & practices ● Build cool applications ● Contribute to the Web of Data ● Make better IT management decisions
  • 4. First: Consuming the Web of Data 1. From the Web of Documents to the Web of Data 2. Linked Data: Building Blocks of the Web (of data) 3. Mashups: Consuming Linked Data
  • 5. Web Architecture First Principles of the Web... ● A standard system for identifying resources ● Standard formats for representing resources ● A standard protocol for exchanging resources Relevant core standards: ● URIs (URLs): Uniform Resource Identifiers ● HTML: Hypertext Markup Language ● HTTP: Hypertext Transfer Protocol
  • 6. Architecture of the World Wide Web, Volume One http://www.w3.org/TR/webarch/
  • 7. Data Mining: Mapping the Blogosphere http://bit.ly/18MuXdD
  • 8. Identifying Web Resources (1) ● A global identification system is essential ○ to share information about resources ○ to reason about resources ○ to modify or exchange resources ● Resources are anything that can be linked to or spoken of ○ Documents, cat videos, people, ideas... ● Not all resources are "on" the Web ○ They might be referenced from the Web... ○ ...while not being retrievable from it ○ These are (so-called) "non-information resources" Les Carr, et al. http://slidesha.re/142MFrV
  • 9. Identifying Web Resources (2) ● A global standard is required; the URI is it ● Other systems are possible... ○ ...but the added value of a single global system of identifiers is high ○ Enables linking, bookmarking and other functions across heterogeneous applications ● How are URIs used? ○ All resources have URIs associated with them ○ Each URI identifies a single resource in a context-independent manner ○ URIs act as names and (usually) addresses ○ In general URIs are "opaque" Uniform Resource Identifier (URI): Generic Syntax (RFC 3986) http://www.ietf.org/rfc/rfc3986.txt
  • 10. Identifying Web Resources (3) ● "URIs identify and URLs locate..." ○ ...and identify ● URLs are URIs aligned with protocols ○ URLs include the "access mechanism" or "network location", e.g. http:// or ftp:// ○ How to "dereference" the URI and retrieve the thing ● URL examples ○ ftp://ftp.is.co.za/rfc/rfc1808.txt ○ http://www.ietf.org/rfc/rfc2396.txt ○ mailto:[email protected] ○ telnet://192.0.2.16:80/ Uniform Resource Identifier (URI): Generic Syntax (RFC 3986) http://www.ietf.org/rfc/rfc3986.txt
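To make the "access mechanism" and "network location" parts of a URL concrete, here is a minimal Python sketch that splits three of the slide's example URLs with the standard library. The helper name `describe` and the dictionary keys are illustrative choices, not terms from the slides.

```python
from urllib.parse import urlsplit

# URL examples copied from the slide above.
EXAMPLES = [
    "ftp://ftp.is.co.za/rfc/rfc1808.txt",
    "http://www.ietf.org/rfc/rfc2396.txt",
    "telnet://192.0.2.16:80/",
]

def describe(uri):
    """Split a URI into the parts that tell a client how to dereference it."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,      # the access mechanism, e.g. "http" or "ftp"
        "authority": parts.netloc,   # the network location (host and optional port)
        "path": parts.path,          # where the resource lives on that host
    }

if __name__ == "__main__":
    for uri in EXAMPLES:
        print(uri, "->", describe(uri))
```

The scheme is what distinguishes a locator from a bare name: a client that understands the scheme knows which protocol to speak to the authority in order to retrieve a representation.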
  • 11. Representing Resources (1) ● Resources are manifest as digital files ○ More precisely: serializations that look like files... ● The Web recognizes a (growing) set of {file | serialization} formats ○ The original and workhorse is HTML... ○ ...but there are many others ● Retrievable resources on the web serve multiple purposes ○ Resources encode information and data ○ Resources aggregate links to other resources ● This is what makes The Web(tm) a "web..."
  • 12. Resources (nodes) aggregate links to other resources to create a Web
  • 13. Retrieving Resources (1) ● Review: URLs refer to retrievable resources ○ i.e., URIs that specify some protocol for retrieval ● The original and most common Web protocol is HTTP ● Specialized protocols are possible, but resources may appear "off the grid..." ● The more common case is HTTP with different formats...
  • 14. URIs, HTTP, many formats... http://www.w3.org/2006/Talks/0521-sb-AC-management/ReCTechStack-bg.png
  • 15. Principles for creating a healthy Web Tim Berners-Lee http://www.w3.org/DesignIssues/LinkedData.html ● Use URIs as names for things ● Use HTTP URIs so people can look up those names ● When someone looks up a URI, return useful information ○ use standard representation formats to express it ● Include links to other URIs, so consumers can discover more things ○ By "consumers" we mean people or applications Why is linking important???
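One common way a client "looks up" an HTTP URI and asks for a standard representation is HTTP content negotiation. The sketch below only prepares such a request and does not send it. The preference order in `RDF_ACCEPT` and the helper name `build_lookup` are assumptions for illustration; the document URL is the one used in the FOAF examples later in this deck.

```python
from urllib.request import Request

# Media types for a few RDF serializations; the preference order is an assumption.
RDF_ACCEPT = "text/turtle, application/ld+json;q=0.9, application/rdf+xml;q=0.8"

def build_lookup(uri):
    """Prepare (but do not send) an HTTP GET that asks for RDF describing `uri`."""
    return Request(uri, headers={"Accept": RDF_ACCEPT}, method="GET")

if __name__ == "__main__":
    req = build_lookup("http://kiwitobes.com/toby.rdf")
    print(req.get_method(), req.full_url)
    print("Accept:", req.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the lookup; whether a given server honors the `Accept` header depends on how it is configured.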
  • 16. Implications of a well-connected Web Links to other nodes are a "vote" of quality and/or relevance PageRank https://en.wikipedia.org/wiki/PageRank Google PageRank
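The "links as votes" idea can be sketched with a small power-iteration version of PageRank. This is a textbook simplification, not Google's production algorithm; the damping factor of 0.85 is the commonly cited default, and the four-page `TOY_WEB` is hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of node -> list of linked nodes."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Each outgoing link passes on an equal share of this node's rank.
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for t in nodes:
                    new_rank[t] += damping * rank[node] / len(nodes)
        rank = new_rank
    return rank

# Hypothetical four-page web: three pages link to "hub", which links back to "a".
TOY_WEB = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}

if __name__ == "__main__":
    for page, score in sorted(pagerank(TOY_WEB).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))
```

The page with the most inbound links ends up with the highest score, and the one page it links to inherits much of that score, which is the sense in which a link acts as a vote.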
  • 17. What's this "Semantic Web?" ...and where can I get one???
  • 18. “Web of meaning” Web of Data Linked Data Linking ideas... Semantic Web? meaning : ideas : data
  • 20. subject object predicate RDF: Resource Description Framework
  • 24. We're missing something... ● Check: URIs for names: S, P, O can be URIs ● Check: HTTP URIs: all of our examples are resolvable ● Now: "Return something useful" when we resolve URIs ○ How do we serialize RDF? ○ How do we retrieve RDF? Let's go to the graph...
  • 25. Source: Programming the Web http://bit.ly/1aZwr40
  • 26. "Raw" Triples Via the W3C RDF Validator Service: http://www.w3.org/RDF/Validator/
  • 27. N-Triples...
      <http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/homepage> <http://kiwitobes.com/>.
      <http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/nick> "kiwitobes".
      <http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/name> "Toby Segaran".
      <http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/mbox> <mailto:[email protected]>.
      <http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/interest> <http://semprog.com>.
      <http://kiwitobes.com/toby.rdf#ts> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person>.
      <http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/knows> _:jamie .
      <http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/knows> <http://semprog.com/people/colin>.
      _:jamie <http://xmlns.com/foaf/0.1/name> "Jamie Taylor".
      _:jamie <http://xmlns.com/foaf/0.1/mbox> <mailto:[email protected]>.
      _:jamie <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person>.
      <http://semprog.com/people/colin> <http://xmlns.com/foaf/0.1/name> "Colin Evans".
      <http://semprog.com/people/colin> <http://xmlns.com/foaf/0.1/mbox> <mailto:[email protected]>.
      <http://semprog.com/people/colin> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person>.
      <http://semprog.com> <http://www.w3.org/2000/01/rdf-schema#label> "Semantic Programming".
      <http://semprog.com> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Document>.
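Because N-Triples puts one complete triple on each line, a very small parser is enough to read simple cases. The following is a rough sketch that handles only IRIs, blank nodes and plain literals (no language tags, datatypes or escape sequences); the names `TRIPLE`, `parse_ntriples` and `SAMPLE` are illustrative, and a real application would more likely use an RDF library.

```python
import re

# Simplified N-Triples line: IRI or blank-node subject, IRI predicate,
# and an IRI, blank-node, or plain-literal object. Language tags, datatypes
# and escape sequences are deliberately not handled in this sketch.
TRIPLE = re.compile(
    r'^\s*(<[^>]*>|_:\w+)\s+(<[^>]*>)\s+(<[^>]*>|_:\w+|"[^"]*")\s*\.\s*$'
)

def parse_ntriples(text):
    """Return a list of (subject, predicate, object) strings, one per line."""
    triples = []
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE.match(line)
        if match is None:
            raise ValueError(f"not a simple N-Triples line: {line!r}")
        triples.append(match.groups())
    return triples

# Three triples taken from the slide above.
SAMPLE = """\
<http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/nick> "kiwitobes".
<http://kiwitobes.com/toby.rdf#ts> <http://xmlns.com/foaf/0.1/knows> _:jamie .
_:jamie <http://xmlns.com/foaf/0.1/name> "Jamie Taylor".
"""

if __name__ == "__main__":
    for s, p, o in parse_ntriples(SAMPLE):
        print(s, p, o)
```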
  • 28. N3...
      @prefix foaf: <http://xmlns.com/foaf/0.1/>.
      @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
      @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
      @prefix semperp: <http://semprog.com/people/>.
      @prefix tobes: <http://kiwitobes.com/toby.rdf#>.
      tobes:ts a foaf:Person;
          foaf:homepage <http://kiwitobes.com/>;
          foaf:interest <http://semprog.com>;
          foaf:knows semperp:colin, [ a foaf:Person; foaf:mbox <mailto:[email protected]>; foaf:name "Jamie Taylor"];
          foaf:mbox <mailto:[email protected]>;
          foaf:name "Toby Segaran";
          foaf:nick "kiwitobes".
      <http://semprog.com> a foaf:Document;
          rdfs:label "Semantic Programming".
      semperp:colin a foaf:Person;
          foaf:mbox <mailto:[email protected]>;
          foaf:name "Colin Evans".
  • 29. RDFa...
      :
      <div xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" about="http://kiwitobes.com/toby.rdf#ts" typeof="foaf:Person">
        Name: <span property="foaf:name">Toby Segaran</span><br/>
        Nickname: <span property="foaf:nick">kiwitobes</span><br/>
        Interests: <a rel="foaf:interest" href="http://semprog.com"> <span property="rdfs:label">Semantic Programming</span></a>
        Homepage: <a rel="foaf:homepage" href="http://kiwitobes.com/">KiwiTobes</a><p/>
        Friends:<br/>
        <ul rel="foaf:knows">
          <li about="http://semprog.com/people/colin" typeof="foaf:Person" property="foaf:name">Colin Evans</li>
          <li typeof="foaf:Person"> <span property="foaf:name">Jamie Taylor</span><br/> Email: <a rel="foaf:mbox" href="mailto:[email protected]"> [email protected]</a><br/> </li>
        </ul>
      </div>
      :
  • 30. RDF/XML...
      <rdf:RDF xmlns:foaf='http://xmlns.com/foaf/0.1/' xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#' xmlns:rdfs='http://www.w3.org/2000/01/rdf-schema#'>
        <foaf:Person rdf:about="http://kiwitobes.com/toby.rdf#ts">
          <foaf:name>Toby Segaran</foaf:name>
          <foaf:homepage rdf:resource="http://kiwitobes.com/"/>
          <foaf:nick>kiwitobes</foaf:nick>
          <foaf:mbox rdf:resource="mailto:[email protected]"/>
          <foaf:interest>
            <foaf:Document rdf:about="http://semprog.com">
              <rdfs:label>Semantic Programming</rdfs:label>
            </foaf:Document>
          </foaf:interest>
          <foaf:knows>
            <foaf:Person rdf:about="http://semprog.com/people/colin">
              <foaf:name>Colin Evans</foaf:name>
              <foaf:mbox rdf:resource="mailto:[email protected]"/>
            </foaf:Person>
          </foaf:knows>
          <foaf:knows>
            <foaf:Person>
              <foaf:name>Jamie Taylor</foaf:name>
              <foaf:mbox rdf:resource="mailto:[email protected]"/>
            </foaf:Person>
          </foaf:knows>
        </foaf:Person>
      </rdf:RDF>
  • 31. JSON-LD...
      { "@graph": [
        { "@id": "http://kiwitobes.com/toby.rdf#ts",
          "@type": "http://xmlns.com/foaf/0.1/Person",
          "http://xmlns.com/foaf/0.1/homepage": { "@id": "http://kiwitobes.com/" },
          "http://xmlns.com/foaf/0.1/interest": { "@id": "http://semprog.com" },
          "http://xmlns.com/foaf/0.1/knows": [
            { "@type": "http://xmlns.com/foaf/0.1/Person",
              "http://xmlns.com/foaf/0.1/mbox": { "@id": "mailto:[email protected]" },
              "http://xmlns.com/foaf/0.1/name": "Jamie Taylor" },
            { "@id": "http://semprog.com/people/colin" } ],
          "http://xmlns.com/foaf/0.1/mbox": { "@id": "mailto:[email protected]" },
          "http://xmlns.com/foaf/0.1/name": "Toby Segaran",
          "http://xmlns.com/foaf/0.1/nick": "kiwitobes" },
        { "@id": "http://semprog.com/people/colin",
          "@type": "http://xmlns.com/foaf/0.1/Person",
          "http://xmlns.com/foaf/0.1/mbox": { "@id": "mailto:[email protected]" },
          "http://xmlns.com/foaf/0.1/name": "Colin Evans" },
        { "@id": "http://semprog.com",
          "@type": "http://xmlns.com/foaf/0.1/Document",
          "http://www.w3.org/2000/01/rdf-schema#label": "Semantic Programming" }
      ] }
  • 32. RDF/JSON...
      {
        "http://kiwitobes.com/toby.rdf#ts": {
          "http://xmlns.com/foaf/0.1/nick": [ { "type": "literal", "value": "kiwitobes" } ],
          "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [ { "type": "uri", "value": "http://xmlns.com/foaf/0.1/Person" } ],
          "http://xmlns.com/foaf/0.1/interest": [ { "type": "uri", "value": "http://semprog.com" } ],
          "http://xmlns.com/foaf/0.1/knows": [ { "type": "uri", "value": "http://semprog.com/people/colin" }, { "type": "bnode", "value": "_:N40b366148cfc4c48a80f4e15acbd2858" } ],
          "http://xmlns.com/foaf/0.1/mbox": [ { "type": "uri", "value": "mailto:[email protected]" } ],
          "http://xmlns.com/foaf/0.1/homepage": [ { "type": "uri", "value": "http://kiwitobes.com/" } ],
          "http://xmlns.com/foaf/0.1/name": [ { "type": "literal", "value": "Toby Segaran" } ]
        },
        "http://semprog.com/people/colin": {
          "http://xmlns.com/foaf/0.1/mbox": [ { "type": "uri", "value": "mailto:[email protected]" } ],
          "http://xmlns.com/foaf/0.1/name": [ { "type": "literal", "value": "Colin Evans" } ],
          "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [ { "type": "uri", "value": "http://xmlns.com/foaf/0.1/Person" } ]
        },
        "_:N40b366148cfc4c48a80f4e15acbd2858": {
          "http://xmlns.com/foaf/0.1/mbox": [ { "type": "uri", "value": "mailto:[email protected]" } ],
          "http://xmlns.com/foaf/0.1/name": [ { "type": "literal", "value": "Jamie Taylor" } ],
          "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [ { "type": "uri", "value": "http://xmlns.com/foaf/0.1/Person" } ]
        },
        "http://semprog.com": {
          "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [ { "type": "uri", "value": "http://xmlns.com/foaf/0.1/Document" } ],
          "http://www.w3.org/2000/01/rdf-schema#label": [ { "type": "literal", "value": "Semantic Programming" } ]
        }
      }
  • 33. RDF Serialization: Summary ● N-Triples: Verbose, "pedagogical" ● N3: Concise, in common use ● RDFa: Commonly used for embedded RDF ● RDF/XML: Some use in government & "enterprise" ● JSON-LD: Fast-rising LD standard ● RDF/JSON: Older convention for LD applications
  • 34. Things we still haven't discussed... ● How to retrieve this "linked data" of which I speak ● How (and where?) to query RDF "graphs" ● How to use LD in applications ● How to create visualizations & "mashups" Also: ● How to create and publish linked data...
  • 35. Consuming Linked Data ● Querying RDF: SPARQL ● Endpoints and triple stores ● "Mashing" data in the query ● Mashing data in the application
  • 36. Anatomy of a Mashup Demo: http://logd.tw.rpi.edu/demo/building-logd-visualizations/agi-per-capita-v2.html (use Firefox)
  • 37. Deep Dive: LOGD Mashup Tutorial 1. Choose data (two datasets from Data.gov) ○ "State Library Agency Survey: Fiscal Year 2006" ○ "Tax Year 2007 County Income Data" 2. Define queries to retrieve desired results from endpoint ○ http://logd.tw.rpi.edu/demo/building-logd-visualizations/mashup-353-population-1356-agi.sparql ○ Submit this URI to http://logd.tw.rpi.edu/sparql 3. Define basic HTML layout 4. Insert visualization code (e.g. Google visualization) 5. Pass static data ○ http://logd.tw.rpi.edu/demo/building-logd-visualizations/mashup-353-population-1356-agi.js 6. Revise to pass dynamic data from live SPARQL queries http://logd.tw.rpi.edu/tutorial/building_logd_visualizations
  • 39. Converted RDF on TWC LOGD Portal
  • 40. SPARQL: pattern matching over RDF graphs
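To give a feel for what "pattern matching over RDF graphs" means, here is a toy matcher for a single triple pattern over an in-memory set of triples. It is a sketch of the idea only, not a SPARQL engine: there are no joins across patterns, no FILTERs and no named graphs. The convention that variables are strings starting with `?` mirrors SPARQL syntax; the function and constant names are illustrative.

```python
# A tiny RDF graph as a set of (subject, predicate, object) tuples,
# using names from the FOAF example earlier in this deck.
FOAF = "http://xmlns.com/foaf/0.1/"
TOBY = "http://kiwitobes.com/toby.rdf#ts"
COLIN = "http://semprog.com/people/colin"

GRAPH = {
    (TOBY, FOAF + "name", "Toby Segaran"),
    (TOBY, FOAF + "knows", COLIN),
    (COLIN, FOAF + "name", "Colin Evans"),
}

def match(graph, pattern):
    """Yield variable bindings for one triple pattern.

    Terms starting with "?" are variables; everything else must match exactly.
    """
    for triple in graph:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice in one pattern must bind to the same value.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield bindings

if __name__ == "__main__":
    for row in match(GRAPH, ("?person", FOAF + "name", "?name")):
        print(row)
```

A full query such as the tutorial's combines many such patterns and keeps only the bindings that agree on shared variables.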
  • 47. LOGD Tutorial SPARQL Query
      # this query returns the agi and population data from two data.gov datasets
      SELECT distinct ?state_abbv ?agi ?population
      WHERE {
        GRAPH <http://logd.tw.rpi.edu/source/data-gov/dataset/353/version/1st-anniversary> {
          ?s1 <http://logd.tw.rpi.edu/source/data-gov/dataset/353/vocab/raw/popu_st> ?population .
          ?s1 <http://logd.tw.rpi.edu/source/data-gov/dataset/353/vocab/raw/pub_fips> ?state_fipscode .
        }
        GRAPH <http://logd.tw.rpi.edu/source/data-gov/dataset/1356/version/2009-Dec-03> {
          ?s2 <http://logd.tw.rpi.edu/source/data-gov/dataset/1356/vocab/raw/state_abbrv> ?state_abbv .
          ?s2 <http://logd.tw.rpi.edu/source/data-gov/dataset/1356/vocab/raw/county_code> "000" .
          ?s2 <http://logd.tw.rpi.edu/source/data-gov/dataset/1356/vocab/raw/agi> ?agi .
          ?s2 <http://logd.tw.rpi.edu/source/data-gov/dataset/1356/vocab/raw/state_code> ?state_fipscode .
        }
      }
      order by ?state_fipscode
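A query like the one above is typically sent to an endpoint over HTTP. The sketch below builds a SPARQL Protocol GET URL, which carries the query text in the `query` parameter, without actually sending it. The endpoint is the one named in the deck and may no longer be live; `QUERY` is a short placeholder rather than the tutorial query, and `sparql_get_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://logd.tw.rpi.edu/sparql"  # endpoint named in the deck; may no longer be live

# Placeholder query for illustration, not the tutorial's query.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"

def sparql_get_url(endpoint, query):
    """Build a SPARQL Protocol GET URL: the query text goes in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})

if __name__ == "__main__":
    url = sparql_get_url(ENDPOINT, QUERY)
    print(url)
    # Decoding the URL again recovers the original query text.
    print(parse_qs(urlsplit(url).query)["query"][0])
```

URL-encoding matters here because SPARQL text is full of characters such as `?`, `{` and spaces that would otherwise be misread as part of the URL itself.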
  • 49. SPARQL Query results (JSON) google.visualization.Query.setResponse({version:0.6,status:'ok',reqId:0,table:{cols:[{id:'state_abbv', label:'state_abbv',type:'string'},{id:'agi',label:'agi',type:'number'},{id:'population',label:'population', type:'number'}],rows:[{c:[{v:'AL'},{v:92162773},{v:4599030}]},{c:[{v:'AK'},{v:17312636},{v:670053}]},{c: [{v:'AZ'},{v:134442007},{v:6166318}]},{c:[{v:'AR'},{v:49783294},{v:2810872}]},{c:[{v:'CA'},{v:913619942},{v: 36457549}]},{c:[{v:'CO'},{v:128175529},{v:4753377}]},{c:[{v:'CT'},{v:122697142},{v:3504809}]},{c:[{v:'DE'}, {v:22983204},{v:853476}]},{c:[{v:'DC'},{v:18177370},{v:581530}]},{c:[{v:'FL'},{v:429785960},{v:18089888}]}, {c:[{v:'GA'},{v:199864840},{v:9363941}]},{c:[{v:'HI'},{v:30592983},{v:1285498}]},{c:[{v:'ID'},{v:30292717}, {v:1466465}]},{c:[{v:'IL'},{v:339217881},{v:12831970}]},{c:[{v:'IN'},{v:140616570},{v:6313520}]},{c: [{v:'IA'},{v:68946837},{v:2982085}]},{c:[{v:'KS'},{v:65216515},{v:2764075}]},{c:[{v:'KY'},{v:81721206},{v: 4206074}]},{c:[{v:'LA'},{v:84029967},{v:4287768}]},{c:[{v:'ME'},{v:28954363},{v:1321574}]},{c:[{v:'MD'},{v: 168647138},{v:5615727}]},{c:[{v:'MA'},{v:202226349},{v:6437193}]},{c:[{v:'MI'},{v:227233854},{v:10095643}]}, {c:[{v:'MN'},{v:143482070},{v:5167101}]},{c:[{v:'MS'},{v:47387966},{v:2910540}]},{c:[{v:'MO'},{v:131166510}, {v:5842713}]},{c:[{v:'MT'},{v:20045504},{v:944632}]},{c:[{v:'NE'},{v:41569440},{v:1768331}]},{c:[{v:'NV'}, {v:65272642},{v:2495529}]},{c:[{v:'NH'},{v:38175000},{v:1314895}]},{c:[{v:'NJ'},{v:283024874},{v:8724560}]}, {c:[{v:'NM'},{v:38144029},{v:1954599}]},{c:[{v:'NY'},{v:513598458},{v:19306183}]},{c:[{v:'NC'},{v: 195374554},{v:8856505}]},{c:[{v:'ND'},{v:14923738},{v:635867}]},{c:[{v:'OH'},{v:259099675},{v:11478006}]}, {c:[{v:'OK'},{v:70394493},{v:3579212}]},{c:[{v:'OR'},{v:85591882},{v:3700758}]},{c:[{v:'PA'},{v:313289892}, {v:12440621}]},{c:[{v:'RI'},{v:26532233},{v:1067610}]},{c:[{v:'SC'},{v:88615194},{v:4321249}]},{c:[{v:'SD'}, 
{v:17825580},{v:781919}]},{c:[{v:'TN'},{v:126270760},{v:6038803}]},{c:[{v:'TX'},{v:504386602},{v: 23507783}]},{c:[{v:'UT'},{v:55426179},{v:2550063}]},{c:[{v:'VT'},{v:15246152},{v:623908}]},{c:[{v:'VA'},{v: 217677476},{v:7642884}]},{c:[{v:'WA'},{v:175730868},{v:6395798}]},{c:[{v:'WV'},{v:32243697},{v:1818470}]}, {c:[{v:'WI'},{v:140516394},{v:5556506}]},{c:[{v:'WY'},{v:15216840},{v:515004}]}]}}) http://logd.tw.rpi.edu/demo/building-logd-visualizations/mashup-353-population-1356-agi.js
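The mashup's map plots AGI per capita, which is the `agi` column divided by the `population` column for each state. A minimal sketch of that step, using three rows copied from the results above, follows. The slide does not state the units of `agi`, so the ratio is left in the source's units; `agi_per_capita` is an illustrative name.

```python
# Three (state, agi, population) rows copied from the query results above.
ROWS = [
    ("AL", 92162773, 4599030),
    ("AK", 17312636, 670053),
    ("AZ", 134442007, 6166318),
]

def agi_per_capita(rows):
    """Return {state: agi / population}.

    Units follow the source data; the slide does not say what they are.
    """
    return {state: agi / population for state, agi, population in rows}

if __name__ == "__main__":
    for state, value in agi_per_capita(ROWS).items():
        print(state, round(value, 2))
```

This is the "mashing data in the application" step: the join across the two datasets happened in the query, and the derived measure is computed client-side.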
  • 51. Defining HTML Layout
      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
      <html xmlns="http://www.w3.org/1999/xhtml">
        <head> <title>AGI per Capita Map</title> </head>
        <body>
          <div>AGI per Capita Map: average adjusted gross income per person in dollar amount in US states.</div>
          <div id='map_canvas'>Loading Map ...</div>
        </body>
      </html>
      http://logd.tw.rpi.edu/tutorial/building_logd_visualizations
  • 52. Visualization Code... 1. Load the appropriate Google Visualization API packages (in this case, the GeoMap package). 2. Define a callback function for loading visualization code, which is called upon the loading of the HTML page. 3. Obtain data from a given source to pass to our GeoMap instance. The Google Visualization API is designed to accept data in the form of specially-formatted JSON (represented by a URI) which can then be fed to a JSON processing function. 4. Following a call to the JSON processor, verify that it successfully processed the passed file. 5. Get back a response from the query processor, containing the data from the JSON file. 6. Define a data table to store the response data in. This process starts by defining header entries of the form TABLE.addColumn(DATATYPE, NAME). 7. For each entry in the response, create a new data table row for the corresponding data. 8. Define a configuration for the GeoMap instance to be visualized, containing information such as resolution. 9. Define the GeoMap instance in the HTML div with id='map_canvas', using the configuration from Step 8 and data table from Step 7. http://logd.tw.rpi.edu/tutorial/building_logd_visualizations
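Steps 6 and 7 above amount to building a table with typed columns and one row per result. As a language-neutral illustration, the Python sketch below produces a dictionary in the same `cols`/`rows` shape that appears in the query results slide. It does not call the Google Visualization API (which is JavaScript), and `to_gviz_table` is a hypothetical helper name.

```python
import json

def to_gviz_table(columns, rows):
    """Build a table dict in the cols/rows shape shown in the query results slide.

    `columns` is a list of (id, type) pairs; `rows` is a list of value tuples.
    """
    return {
        # Step 6: one header entry per column, with its data type.
        "cols": [{"id": cid, "label": cid, "type": ctype} for cid, ctype in columns],
        # Step 7: one row per result entry, each cell wrapped as {"v": value}.
        "rows": [{"c": [{"v": value} for value in row]} for row in rows],
    }

COLUMNS = [("state_abbv", "string"), ("agi", "number"), ("population", "number")]
ROWS = [("AL", 92162773, 4599030), ("AK", 17312636, 670053)]

if __name__ == "__main__":
    print(json.dumps(to_gviz_table(COLUMNS, ROWS), indent=2))
```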
  • 54. Dynamic Visualizations Loading data using SPARQL queries
      // load data using SPARQL query
      var sparqlproxy = "http://logd.tw.rpi.edu/ws/sparqlproxy.php";
      var queryloc = "http://logd.tw.rpi.edu/demo/building-logd-visualizations/mashup-353-population-1356-agi.sparql";
      var service = "http://logd.tw.rpi.edu/sparql";
      // ask the proxy to run the stored query against the endpoint and return Google Visualization data
      var queryurl = sparqlproxy + "?" + "output=gvds" + "&service-uri=" + encodeURIComponent(service) + "&query-uri=" + encodeURIComponent(queryloc);
  • 55. Next: Building the Web of Data ● Converting datasets to RDF ● Hosting: Triplestores & endpoints ● Enterprise use cases ● Advanced techniques ● Web Science...
  • 56. Part II: Building the Web of Data
  • 57. Part II: Building the Web of Data 1. Review: the Web of Data 2. Publishing the Web of Data 3. Engineering the Web of Data in the Enterprise 4. Enterprise Applications of Semantic Technologies 5. Advanced "Semantic Web" concepts 6. Web Science: Observing and (re)Engineering the Web
  • 58. Review: Web Architecture First Principles of the Web... ● A standard system for identifying resources ● Standard formats for representing resources ● A standard protocol for exchanging resources Relevant core standards: ● URIs (URLs): Uniform Resource Identifiers ● HTML: Hypertext Markup Language ● HTTP: Hypertext Transfer Protocol
  • 59. Review: Linked Data Principles Tim Berners-Lee http://www.w3.org/DesignIssues/LinkedData.html ● Use URIs as names for things ● Use HTTP URIs so people can look up those names ● When someone looks up a URI, return useful information ○ use standard representation formats to express it ● Include links to other URIs, so consumers can discover more things ○ By "consumers" we mean people or applications
  • 60. Now: Publishing the Web of Data Recall our triples from previous lecture...
  • 61. Publishing the Web of Data Recall our triples from previous lecture... To be useful, this data must be loaded in a triple store and published via a web-accessible SPARQL endpoint
  • 62. Industrial-strength Triple stores 1. AllegroGraph (1+Trillion) 2. OpenLink Virtuoso v6.1 - 15.4B+ explicit; uncounted virtual/inferred 3. BigOWLIM (12B explicit, 20B total); 100,000 queries per $1 4. Garlik 4store (15B) 5. Bigdata(R) (12.7B) 6. YARS2 (7B) 7. Jena TDB (1.7B) 8. Jena SDB (650M) 9. Mulgara (500M) 10. RDF gateway (262M) 11. Jena with PostgreSQL (200M) 12. Kowari (160M) 13. 3store with MySQL 3 (100M) 14. Sesame (70M) 15. Others who claim to go big TWC uses Virtuoso Open Source edition
  • 63. Industrial-strength Triple stores (same list as slide 62) You can install Apache Jena yourselves!
  • 65. Publishing: RDBMS to RDF ● Advantage: Leveraging "legacy" sources ● Challenge: Complexity... ● Example: D2RQ Platform ○ D2RQ Mapping Language ○ D2RQ Engine ○ D2R Server See also: http://d2rq.org/
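The core idea behind exposing a relational database as RDF can be sketched in a few lines: a row's primary key becomes a subject URI, each column becomes a predicate, and each cell becomes an object. The example below uses an in-memory SQLite table. The table layout, the `http://example.org/` base and the function names are assumptions for illustration; this is not the D2RQ mapping language or its API.

```python
import sqlite3

BASE = "http://example.org/"          # hypothetical URI base for minted identifiers
FOAF = "http://xmlns.com/foaf/0.1/"

def rows_to_triples(conn):
    """Map each row of a `person` table to RDF-style triples.

    The table layout and the URI scheme are assumptions for illustration;
    this is not the D2RQ mapping language.
    """
    triples = []
    query = "SELECT id, name, nick FROM person ORDER BY id"
    for person_id, name, nick in conn.execute(query):
        subject = f"{BASE}person/{person_id}"            # primary key -> subject URI
        triples.append((subject, FOAF + "name", name))   # column -> predicate
        triples.append((subject, FOAF + "nick", nick))
    return triples

def demo_db():
    """Create a one-row in-memory database to run the mapping against."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, nick TEXT)")
    conn.execute("INSERT INTO person VALUES (1, 'Toby Segaran', 'kiwitobes')")
    return conn

if __name__ == "__main__":
    for triple in rows_to_triples(demo_db()):
        print(triple)
```

The "complexity" challenge named on the slide shows up as soon as the schema has foreign keys, composite keys or NULLs, none of which this sketch attempts to handle.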
  • 66. Publishing: Linked Data API Motivations: ● SPARQL and RDF have steep learning curves ● RDF support in common web development tool stacks is scarce ● The Linked Data API is one solution Advantages: ● Easy-to-use web API over linked data ● Allows publishers to provide URIs for lists of things ● Allows users to get back the data as JSON, XML, or RDF ● Easy to filter data using simple URL query parameters Makes it easy to create web applications over the published data using standard tools https://code.google.com/p/linked-data-api/wiki/Specification
  • 67. Linked Data API Example UK Bathing Water Data Explorer Live: http://environment.data.gov.uk/lab/bwq-web Details: http://www.epimorphics.com/web/projects/bathing-water-quality
  • 68. Architecture of Linked Data Applications ● The Crawling Pattern ● The On-The-Fly Dereferencing Pattern ● The Query Federation Pattern The architecture of a Linked Data application depends on its driving use case
  • 69. The Crawling Pattern ● Applications "crawl" the Web of Data in advance by traversing RDF links ● Integrate and cleanse discovered data ● Provide higher layers of the application with an integrated view of the original data ● Mimics the architecture of classical Web search engines like Google and Yahoo ● Suitable for implementing applications on top of an open, growing set of sources ○ new data sources are discovered by the crawler at run-time. ● Separates the tasks of building up the cache and using cache later ○ enables applications to execute complex queries with reasonable performance over large amounts of data Disadvantages: ● Data is replicated ● Applications may work with stale data; crawler only re-crawls sources at certain intervals The crawling pattern is implemented by Linked Data search engines
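As a rough illustration of the crawling pattern, the sketch below walks a toy in-memory "Web of Data" breadth-first, caches every triple it finds, and then answers a query from the cache alone. The `WEB` dictionary, the `ex:` URIs, and the `crawl` function are assumptions for the example; a real crawler would dereference URIs over HTTP and cleanse the data.

```python
from collections import deque

# Toy stand-in for the Web of Data: URI -> triples returned when it is dereferenced.
# All URIs and data are hypothetical.
WEB = {
    "ex:alice": [("ex:alice", "foaf:knows", "ex:bob"), ("ex:alice", "foaf:name", '"Alice"')],
    "ex:bob": [("ex:bob", "foaf:knows", "ex:carol"), ("ex:bob", "foaf:name", '"Bob"')],
    "ex:carol": [("ex:carol", "foaf:name", '"Carol"')],
}


def crawl(seed, fetch, max_fetches=100):
    """Breadth-first crawl: dereference a URI, cache its triples, follow RDF links."""
    cache, seen, queue = set(), {seed}, deque([seed])
    fetched = 0
    while queue and fetched < max_fetches:
        uri = queue.popleft()
        fetched += 1
        for s, p, o in fetch(uri):
            cache.add((s, p, o))
            # Objects that are not literals are treated as links to follow.
            if not o.startswith('"') and o not in seen:
                seen.add(o)
                queue.append(o)
    return cache


# Step 1 (in advance): build the cache.
cache = crawl("ex:alice", lambda uri: WEB.get(uri, []))

# Step 2 (later): queries run against the local cache only, not the live sources.
names = sorted(o for s, p, o in cache if p == "foaf:name")
print(names)
```

Because the query only touches `cache`, any change made to `WEB` after the crawl stays invisible until the next crawl, which is the stale-data disadvantage listed on the slide.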
  • 70. "Crawling Pattern" in the Wild Google Rich Snippets
  • 71. The On-The-Fly Dereferencing Pattern ● URIs are dereferenced and links are followed the moment the application requires the data ● Applications never process stale data Disadvantages: ● More complex operations are very slow as they might involve dereferencing thousands of URIs in the background ● Architectures have been proposed for answering complex queries over the Web of Data by relying on the on-the-fly dereferencing pattern ● Results show that data currency and a very high degree of completeness are achieved at the price of very slow query execution The on-the-fly dereferencing pattern is implemented by Linked Data browsers
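The cost profile of on-the-fly dereferencing can be sketched as follows. The first helper only prepares an HTTP request that asks for an RDF representation via the `Accept` header; nothing is sent. The second follows links at query time against a toy lookup function and counts how many dereferences a two-hop question needs. The URIs, the `WEB` data, and both function names are assumptions for illustration.

```python
import urllib.request


def build_dereference_request(uri, media_type="text/turtle"):
    """Prepare (but do not send) an HTTP GET asking for an RDF representation of `uri`."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


def follow(start, predicate, hops, dereference):
    """Follow `predicate` links for `hops` steps, dereferencing each URI when it is reached."""
    frontier, lookups = {start}, 0
    for _ in range(hops):
        next_frontier = set()
        for uri in frontier:
            lookups += 1  # one live HTTP round trip per URI in a real application
            for s, p, o in dereference(uri):
                if s == uri and p == predicate:
                    next_frontier.add(o)
        frontier = next_frontier
    return frontier, lookups


# Hypothetical data standing in for live sources.
WEB = {
    "ex:alice": [("ex:alice", "foaf:knows", "ex:bob"), ("ex:alice", "foaf:knows", "ex:carol")],
    "ex:bob": [("ex:bob", "foaf:knows", "ex:dave")],
    "ex:carol": [("ex:carol", "foaf:knows", "ex:dave"), ("ex:carol", "foaf:knows", "ex:erin")],
}

request = build_dereference_request("http://example.org/alice")
people, lookups = follow("ex:alice", "foaf:knows", 2, lambda uri: WEB.get(uri, []))
print(sorted(people), lookups)
```

The data is always current because nothing is cached, but the lookup count grows with every hop and every branch, which is why complex operations become slow.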
  • 72. "On-the-fly" examples ● Our previous example (dynamic version) ● Tabulator, Marbles
  • 73. The Query Federation Pattern ● Relies on sending complex queries (or parts of them) directly to a fixed set of data sources. ● Useful if data sources provide SPARQL endpoints in addition to serving their data on the Web via dereferenceable URIs ● Enables applications to work with current data without needing to replicate complete data sources locally Disadvantages: ● Finding performant query execution plans for join queries over larger numbers of data sources is complex (i.e. a research topic) ● Query performance slows down significantly when the number of data sources grows ● Query federation pattern should only be used in situations where the number of data sources is known to be small Applications could follow links between data sources, examine voiD descriptions provided by these data sources and then include data sources which provide SPARQL endpoints into their list of targets for federated queries http://linkeddatabook.com/editions/1.0/
  • 74. Query Federation Example SELECT ?birthDate ?spouseName ?movieTitle ?movieDate { { SERVICE <http://dbpedia.org/sparql> { SELECT ?birthDate ?spouseName WHERE { ?actor rdfs:label "Arnold Schwarzenegger"@en ; dbpo:birthDate ?birthDate ; dbpo:spouse ?spouseURI . ?spouseURI rdfs:label ?spouseName . FILTER ( lang(?spouseName) = "en" ) } } } { SERVICE <http://data.linkedmdb.org/sparql> { SELECT ?actor ?movieTitle ?movieDate WHERE { ?actor imdb:actor_name "Arnold Schwarzenegger" . ?movie imdb:actor ?actor ; dcterms:title ?movieTitle ; dcterms:date ?movieDate . } } } } Application Code Federated SPARQL Service DBPedia.org LinkedMDB.org e.g. Jena ARQ See also bobdc.org http://bit.ly/HLdQ4S
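What the federated service has to do with the partial results can be sketched as a join over solution mappings. The rows below are hand-written stand-ins, not output from DBpedia or LinkedMDB, and the `join` function and variable names are assumptions for illustration; an engine such as Jena ARQ does this internally with far better execution plans.

```python
def join(left, right):
    """Join two lists of solution mappings (dicts) on the variables they share."""
    joined = []
    for a in left:
        for b in right:
            shared = a.keys() & b.keys()
            # Rows are compatible when they agree on every shared variable.
            if all(a[v] == b[v] for v in shared):
                joined.append({**a, **b})
    return joined


# Hand-written stand-ins for what two endpoints might return.
person_rows = [{"name": "Arnold Schwarzenegger", "birthDate": "1947-07-30"}]
film_rows = [
    {"name": "Arnold Schwarzenegger", "movieTitle": "The Terminator", "movieDate": "1984"},
    {"name": "Arnold Schwarzenegger", "movieTitle": "Predator", "movieDate": "1987"},
    {"name": "Linda Hamilton", "movieTitle": "The Terminator", "movieDate": "1984"},
]

rows = join(person_rows, film_rows)
for row in rows:
    print(row["birthDate"], row["movieTitle"], row["movieDate"])
```

When the two result sets share no variable, the same join degenerates to a cross product, one reason join planning across many sources gets expensive.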
  • 76. Enterprise Use Cases http://www.w3.org/2001/sw/sweo/public/UseCases/ "Enterprise Energy Intelligence" (DERI) "A Semantic Web Content Repository for Clinical Research" (Cleveland Clinic)
  • 77. Cleveland Clinic Use Case ● Improve the Clinic’s ability to use patient data for generating new knowledge to improve future patient care through outcomes-based and longitudinal clinical research. ● Leverage expressiveness and versatility of formats to provide individual patients an appropriate terminology and accessible view of summary data. ● Over 4 years, Cleveland Clinic has developed a representational methodology for bridging data collection, document management, and knowledge representation. ● The result is a unified content repository called SemanticDB. http://www.w3.org/2001/sw/sweo/public/UseCases/ClevelandClinic/
  • 78. Cleveland Clinic Use Case ● SemanticDB internally deployed for production on top of an open source XML & RDF content repository and Firefox (with extensions). ● Methodology realized through a core set of terms that facilitate creation of a domain vocabulary (or domain model) ○ instances of the vocabulary managed automatically by the system. ● Patient records available as both uniform, structured markup and RDF. ● Coordinated use of both representation languages enables a variety of operations on patient record: ○ form-based data entry, transformation to reporting formats, document validation, targeted inference, and querying ○ Operations can be dispatched on the patient record documents and RDF graphs over a uniform set of interfaces. http://www.w3.org/2001/sw/sweo/public/UseCases/ClevelandClinic/
  • 83. DERI "Enterprise Energy Intelligence" http://dgsit.deri.ie/?q=node/15 and http://slidesha.re/195Gnrr
  • 89. More about "Enterprise Linked Data" Part I: Why Link Enterprise Data? ● Semantic Web and the Linked Data Enterprise ● The Role of Community-Driven Data Curation for Enterprises Part II: Approval and Support of Linked Data Projects ● Preparing for a Linked Data Enterprise ● Selling and Building Linked Data: Drive Value and Gain Momentum Part III: Techniques for Linking Enterprise Data ● Enhancing Enterprise 2.0 Ecosystems Using Semantic Web and Linked Data Technologies ● Linking XBRL Financial Data ● Scalable Reasoning Techniques for Semantic Enterprise Data ● Reliable and Persistent Identification of Linked Data Elements Part IV: Success Stories ● Linked Data for Fighting Global Hunger ● Enterprise Linked Data as Core Business Infrastructure ● Standardizing Legal Content with OWL and RDF ● A Role for Semantic Web Technologies in Patient Record Data Collection ● Use of Semantic Web technologies on the BBC Web Sites http://3roundstones.com/led_book/led-contents.html
  • 90. ● Vocabulary design/RDFS ● Knowledge Organization ● Ontology design ● Provenance ● Inference Advanced Concepts
  • 91. ● Vocabulary design/RDFS ● Knowledge Organization ● Ontology design ● Provenance ● Inference Advanced Concepts For a poetic (and humorous!) consideration of the evolution of the "Semantic Layer Cake" see: Jim Hendler, "My Take on the Semantic Web Layer Cake." http://bit.ly/195L70i
  • 93. Inference: Discovering New Relationships On the Semantic Web, data is modeled as a set of (named) relationships between resources ● Inference means using automatic procedures to generate new relationships ○ based on the data... ○ ...and some additional information in the form of a vocabulary or a set of rules ● The new relationships may be explicitly added to the set of data, or may be returned at query time (implementation issue) ● The source of additional information is defined through vocabularies or rule sets ● Both approaches draw upon knowledge representation techniques ○ Ontologies provide classification methods, putting an emphasis on defining 'classes', 'subclasses', on how individual resources can be associated to such classes, and characterizing the relationships among classes and their instances ○ Rules define mechanisms for discovering and generating new relationships based on existing ones, much like logic programs (Prolog) ● In the Semantic Web toolkit, RDFS, OWL, or SKOS are used for defining ontologies ○ RIF covers rule based approaches
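A minimal sketch of inference as "data plus rules yields new relationships": the loop below applies rules until nothing new appears, and the single rule shown propagates class membership up a subclass hierarchy. The `ex:` data and the function names are assumptions for the example; this is a toy forward-chainer, not a real reasoner.

```python
def infer(triples, rules):
    """Apply every rule repeatedly until no new triple appears (a fixpoint)."""
    closure = set(triples)
    while True:
        new = set()
        for rule in rules:
            new |= rule(closure)
        if new <= closure:
            return closure
        closure |= new


def subclass_rule(triples):
    """If x is of type C and C is a subclass of D, then x is also of type D."""
    return {
        (x, "rdf:type", d)
        for (x, p1, c) in triples if p1 == "rdf:type"
        for (c2, p2, d) in triples if p2 == "rdfs:subClassOf" and c2 == c
    }


# Hypothetical data.
data = {
    ("ex:rex", "rdf:type", "ex:Dog"),
    ("ex:Dog", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
}

closure = infer(data, [subclass_rule])
derived = closure - data
print(sorted(derived))
```

Here the derived triples are materialized into `closure`; a system could equally compute them only when a query asks, which is the implementation choice the slide mentions.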
  • 94. Vocabulary Design: W3C RDFS (1) ● RDF Vocabulary Description Language ● RDF has no mechanism for: ○ describing properties ○ describing the relationships between properties and other resources ● RDF Schema defines classes and properties for describing classes, properties and other resources ● RDF Schema vocabulary descriptions are written in RDF http://www.w3.org/TR/2004/REC-rdf-schema-20040210/
  • 95. RDF Schema: Classes ● rdfs:Resource ...is the class of everything ● rdfs:Class ...declares a resource as a class for other resources ● rdfs:Literal ...literal values such as strings and integers ● rdfs:Datatype ...the class of datatypes ● rdf:XMLLiteral ...the class of XML literal values ● rdf:Property ...the class of properties http://www.w3.org/TR/2004/REC-rdf-schema-20040210/
  • 96. RDF Schema: Properties ● rdfs:domain ...declares the class of the subject in a triple whose second component is the predicate. ● rdfs:range ...declares the class or datatype of the object in a triple whose second component is the predicate ○ ex:employer rdfs:domain foaf:Person ○ ex:employer rdfs:range foaf:Organization ● rdf:type ...states that a resource is an instance of a class ● rdfs:subClassOf ...allows hierarchies of classes to be declared. ○ e.g. "Every Person is an Agent": foaf:Person rdfs:subClassOf foaf:Agent ● rdfs:subPropertyOf ...states that all resources related by one property are also related by another ● rdfs:label ...used to provide a human-readable version of a resource's name ● rdfs:comment ...provides a human-readable description of a resource ● rdfs:seeAlso ...indicates a resource that might provide additional information about the subject resource. ● rdfs:isDefinedBy ...indicates a resource defining the subject resource http://www.w3.org/TR/2004/REC-rdf-schema-20040210/
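The `ex:employer` example above can be made concrete: once a property has a declared domain and range, using it lets a reasoner type both ends of the triple. The sketch below is a simplified illustration (one domain and one range per property, no datatypes); `ex:alice`, `ex:acme`, and the `entail_types` function are assumptions.

```python
def entail_types(triples):
    """Derive rdf:type triples from rdfs:domain and rdfs:range declarations."""
    # Simplification: at most one domain and one range per property.
    domain = {s: o for s, p, o in triples if p == "rdfs:domain"}
    range_ = {s: o for s, p, o in triples if p == "rdfs:range"}
    inferred = set()
    for s, p, o in triples:
        if p in domain:
            inferred.add((s, "rdf:type", domain[p]))  # the subject belongs to the domain class
        if p in range_:
            inferred.add((o, "rdf:type", range_[p]))  # the object belongs to the range class
    return inferred


triples = {
    ("ex:employer", "rdfs:domain", "foaf:Person"),
    ("ex:employer", "rdfs:range", "foaf:Organization"),
    ("ex:alice", "ex:employer", "ex:acme"),  # hypothetical instance data
}

inferred = entail_types(triples)
print(sorted(inferred))
```

Note that domain and range act as inference rules rather than validation constraints: nothing is rejected, new types are simply concluded.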
  • 97. Knowledge Organization 1: W3C OWL ● Web Ontology Language ● RDFS too weak to describe resources in sufficient detail ○ No localised range and domain constraints ■ Can’t say that the range of hasChild is person when applied to persons and elephant when applied to elephants ○ No existence/cardinality constraints ■ Can’t say that all instances of person have a mother that is also a person, or that persons have exactly 2 parents ○ No transitive, inverse or symmetrical properties ■ Can’t say that isPartOf is a transitive property, that hasPart is the inverse of isPartOf or that touches is symmetrical ● Difficult to provide reasoning support ○ No “native” reasoners for non-standard semantics http://www.w3.org/2007/OWL/wiki/OWL_Working_Group or http://bit.ly/195WANj
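To see what transitive, inverse, and symmetric properties buy you, here is a toy closure over the slide's `isPartOf`, `hasPart`, and `touches` examples. It is only a sketch of the idea and does not implement OWL semantics; the `ex:` resources and the `closure` function are assumptions.

```python
def closure(triples, transitive, symmetric, inverses):
    """Add the triples implied by transitive, symmetric, and inverse property declarations."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverses:
                new.add((o, inverses[p], s))
            if p in transitive:
                # s p o and o p o2 together imply s p o2.
                new |= {(s, p, o2) for s2, p2, o2 in facts if p2 == p and s2 == o}
        if new <= facts:
            return facts
        facts |= new


# Hypothetical instance data.
data = {
    ("ex:finger", "ex:isPartOf", "ex:hand"),
    ("ex:hand", "ex:isPartOf", "ex:arm"),
    ("ex:table", "ex:touches", "ex:wall"),
}

facts = closure(
    data,
    transitive={"ex:isPartOf"},
    symmetric={"ex:touches"},
    inverses={"ex:isPartOf": "ex:hasPart", "ex:hasPart": "ex:isPartOf"},
)
print(sorted(facts - data))
```

In OWL these declarations would be written with `owl:TransitiveProperty`, `owl:SymmetricProperty`, and `owl:inverseOf`, and an OWL reasoner would draw the same conclusions, among many others.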
  • 98. Knowledge Organization 1: W3C OWL Desirable features identified for Web Ontology Language: ● Extends existing Web standards ○ Such as XML, RDF, RDFS ● Easy to understand and use ○ Should be based on familiar KR* idioms ● Formally specified ● Of “adequate” expressive power ● Possible to provide automated reasoning support KR* = knowledge representation http://www.w3.org/2007/OWL/wiki/OWL_Working_Group or http://bit.ly/195WANj or http://bit.ly/1960964
  • 99. OWL Tools: Protege-OWL Editor http://protege.stanford.edu/overview/protege-owl.html
  • 100. Knowledge Organization 2: W3C SKOS ● Simple Knowledge Organization System ● An application of RDFS and OWL ● Provides a way to represent controlled vocabularies, taxonomies and thesauri ○ controlled vocabulary: a list of terms which a community or organization has agreed upon ○ taxonomy: a controlled vocabulary organized in a hierarchy ○ thesaurus: a taxonomy with more information about each concept including preferred and alternative terms. ○ A thesaurus may also contain relationships to related concepts ● SKOS is an OWL ontology; it can be written out in any RDF syntax http://www.w3.org/2004/02/skos/ or http://slidesha.re/1etWDue or http://bit.ly/1etYLlE
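A thesaurus in the SKOS sense can be pictured as concepts with a preferred label, alternative labels, and a link to a broader concept. The dictionary keys below mirror `skos:prefLabel`, `skos:altLabel`, and `skos:broader`; the concepts themselves and the two helper functions are assumptions made for this sketch.

```python
# Hypothetical mini-thesaurus; keys mirror skos:prefLabel, skos:altLabel, skos:broader.
CONCEPTS = {
    "ex:animals": {"prefLabel": "Animals", "altLabels": ["Fauna"], "broader": None},
    "ex:mammals": {"prefLabel": "Mammals", "altLabels": [], "broader": "ex:animals"},
    "ex:cats": {"prefLabel": "Cats", "altLabels": ["Felines", "Domestic cats"], "broader": "ex:mammals"},
}


def find_concept(label):
    """Resolve a preferred or alternative label to its concept URI (case-insensitive)."""
    wanted = label.casefold()
    for uri, concept in CONCEPTS.items():
        labels = [concept["prefLabel"], *concept["altLabels"]]
        if any(wanted == candidate.casefold() for candidate in labels):
            return uri
    return None


def broader_chain(uri):
    """Walk the hierarchy upwards and return every broader concept, nearest first."""
    chain = []
    parent = CONCEPTS[uri]["broader"]
    while parent is not None:
        chain.append(parent)
        parent = CONCEPTS[parent]["broader"]
    return chain


concept = find_concept("felines")
print(concept, broader_chain(concept))
```

The labels alone form a controlled vocabulary, the `broader` links turn it into a taxonomy, and the alternative labels make it a thesaurus.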
  • 101. Provenance: The W3C PROV Model ● A set of W3C recommendations and notes on modelling provenance ● PROV-O is the "core..." http://www.w3.org/TR/prov-primer/
  • 102. Provenance in a Nutshell ● prov:Entity is a physical, digital, conceptual, or other kind of thing with some fixed aspects; entities may be real or imaginary ● prov:Activity is something that occurs over a period of time and acts upon or with entities; it may include consuming, processing, transforming, modifying, relocating, using, or generating entities ● prov:Agent is something that bears some form of responsibility for an activity taking place, for the existence of an entity, or for another agent's activity These three classes provide a basis for the rest of PROV-O http://www.w3.org/TR/prov-primer/ or http://www.provbook.org/
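How entities, activities, and agents fit together can be sketched by tracing an entity backwards. The property names `prov:wasGeneratedBy`, `prov:used`, and `prov:wasAssociatedWith` come from PROV-O; the `ex:` resources and the `lineage` helper are assumptions for the example.

```python
# Hypothetical provenance record: a chart plotted from a dataset that came from a survey.
PROV = {
    ("ex:chart", "prov:wasGeneratedBy", "ex:plotting"),
    ("ex:plotting", "prov:used", "ex:dataset"),
    ("ex:plotting", "prov:wasAssociatedWith", "ex:alice"),
    ("ex:dataset", "prov:wasGeneratedBy", "ex:survey"),
    ("ex:survey", "prov:wasAssociatedWith", "ex:bob"),
}


def objects(subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return {o for s, p, o in PROV if s == subject and p == predicate}


def lineage(entity):
    """Return (activities, source entities, agents) reachable backwards from `entity`."""
    activities, sources, agents = set(), set(), set()
    pending = [entity]
    while pending:
        current = pending.pop()
        for activity in objects(current, "prov:wasGeneratedBy"):
            if activity in activities:
                continue  # already traced
            activities.add(activity)
            agents |= objects(activity, "prov:wasAssociatedWith")
            for used in objects(activity, "prov:used"):
                sources.add(used)
                pending.append(used)
    return activities, sources, agents


activities, sources, agents = lineage("ex:chart")
print(sorted(activities), sorted(sources), sorted(agents))
```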
  • 103. Inference and W3C RIF ● Production Rules ○ Analogous to instruction in a program: If a certain condition holds, then some action is carried out ○ Example: "If a customer has flown more than 100,000 miles, then upgrade him to Gold Member status." ● Declarative Rules ○ Stating a fact about the world ○ Understood as sentences of the form "If P, then Q" ○ Example: "If a person is currently president of the United States of America, then his or her current residence is the White House." ● There are many rule systems, esp. in the expert systems domain ● The W3C Rule Interchange Format is an interchange format between existing rule systems http://www.w3.org/TR/2013/NOTE-rif-primer-20130205/
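The difference between the two rule styles can be shown with the slide's own examples. This is plain Python, not RIF syntax: the production rule performs an action that changes state, while the declarative rule only states what follows from the facts. The customer record, the fact tuples, and the function names are assumptions for illustration.

```python
def production_rule(customer):
    """Production rule: if the condition holds, carry out an action (here: mutate state)."""
    if customer["miles_flown"] > 100_000:
        customer["status"] = "Gold"


def declarative_rule(facts):
    """Declarative rule: 'If P, then Q' yields a conclusion; nothing is executed."""
    return {
        (person, "resides_at", "White House")
        for (person, relation, value) in facts
        if relation == "is_president_of" and value == "United States of America"
    }


# Hypothetical inputs.
customer = {"name": "Pat", "miles_flown": 120_000, "status": "Standard"}
production_rule(customer)

facts = {("ex:person1", "is_president_of", "United States of America")}
derived = declarative_rule(facts)

print(customer["status"], sorted(derived))
```

An interchange format has to carry both kinds of rule between systems without assuming either execution model.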
  • 105. The Future (from the past)
  • 106. What is Web Science? ● Positions the World Wide Web as an object of scientific study unto itself ● Recognizes the Web as a transformational, disruptive technology ● Its practitioners focus on understanding the Web... ○ ...its components, facets and characteristics ● The Web Science Method: “the process of designing things in a very large space..."
  • 107. What does Web Science ask? ● What processes have driven the Web’s growth? ○ Will they persist? ● How does large-scale structure emerge from a simple set of protocols? ● How does the Web function as a socio-technical system? ● What drives the viral uptake of certain Web phenomena? Bottom line: What might fragment the Web?
  • 108. Clare Hooper, et.al. http://bit.ly/R813sC
  • 109. To Probe Further... ● TWC Linking Open Government Data Portal http://logd.tw.rpi.edu ○ Esp: Linking Open Government Data Tutorials: http://logd.tw.rpi.edu/tutorials ● Heath & Bizer, "Linked Data." http://bit.ly/1dxKxNe ● Cambridge Semantics, Semantic University http://bit.ly/1cvy9Mv ● David Wood, "Intro to Linked Data: Modelling" http://slidesha.re/HUmihT ● David Wood, "Intro to Linked Data: Context" http://slidesha.re/1fGhQPv ● David Wood, "Intro to Linked Data: SPARQL" http://slidesha.re/1eUd8Qz ● Rob Stiles, "Linked Data, RDF and SPARQL" http://slidesha.re/17xVIqq ● Ivan Herman, "An Introduction to Semantic Web and Linked Data" http://slidesha.re/1aHREyv ● "Linked Data for the Enterprise" http://slidesha.re/1cvyqyS ● "Semantic Enterprise 2.0" http://slidesha.re/19vkl5u ● "Smart Enterprises" http://slidesha.re/1aXlncX ● "Linked data management" http://slidesha.re/19pHmTw GOOD! 
● "Enterprise Data Meets Web Data" http://slidesha.re/1ifx9AU ● DERI, "Enterprise Energy Management using a Linked Dataspace for Energy Intelligence" http://slidesha.re/195Gnrr ● "Enhancement and Integration of Corporate Social Software Using the Semantic Web" http://bit.ly/1fGi7BW ● "Enabling Semantic Web technologies in the Enterprise 2.0 environment" http://slidesha.re/19vkl5u ● Workshop on Enterprise Semantic Web http://www.wasabi-ws.org/ esp: http://bit.ly/18zMhlp ● Best Buy examples (Jay Myers) http://www.slideshare.net/jaymmyers ● "Querying Semantic Web Databases" http://bit.ly/17vyXy9 ● From "Big Data" to "Smart Data" (e.g. Ontotext example) http://slidesha.re/16s1iI7 ● "How to publish linked data on the web" http://bit.ly/1cvzcfe ○ Superseded by: "Linked Data: Evolving the Web into a Global Data Space" http://linkeddatabook.com/editions/1.0/ ● "Practical Cross-dataset Queries on the Web of Data" http://slidesha.re/1ifxsvy