A Semantic Web Primer
History & Perspective
“We might hope to see the
finances of the Union as clear
and intelligible as a
merchant's books, so that
every member of Congress
and every man of any mind
in the Union should be able
to comprehend them, to
investigate abuses, and
consequently to control
them.”
Thomas Jefferson, 1802
History & Perspective
Web 1.0
 Read Only
 Visual
 Human Presentation
Web 2.0
 Interactive
 Social
 Applications
Web 3.0
“Web of Data”
Graph Theory
Problem: Traverse each bridge once and only once in touring the city
Solution: Zero or two nodes of odd degree
Königsberg, Prussia, 1735
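Euler's condition can be checked by counting node degrees. The sketch below is a minimal illustration, assuming a connected graph; the land-mass labels A to D and the function names are chosen here for illustration and do not come from the slide.

```python
from collections import Counter

# Hypothetical labels: A is the central island, B and C the river banks,
# D the second island. Each tuple is one of the seven bridges.
KONIGSBERG_BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def odd_degree_nodes(edges):
    """Return the sorted nodes touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2 == 1)


def has_euler_walk(edges):
    """Euler's condition for a connected graph: zero or two odd-degree nodes."""
    return len(odd_degree_nodes(edges)) in (0, 2)


if __name__ == "__main__":
    print(odd_degree_nodes(KONIGSBERG_BRIDGES))
    print(has_euler_walk(KONIGSBERG_BRIDGES))
```

All four land masses have odd degree in this model, so no tour crossing each bridge exactly once exists.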
Web 3.0 – The GGG
RelFinder, SpaceTime, Facebook Graph Search, Diseasome,
Linked Life Data, iDrove.it, Edamam
TBL 2009
Google Gets It
Google Knowledge Graph Geosearch Demo
The Knowledge Graph
The Resource Description Framework
(RDF)
Subject: http://www.example.org/~joe/contact.rdf#joesmith (“Subject” resource: joesmith in RDF)
Predicate: http://xmlns.com/foaf/0.1/homepage (“Predicate” resource: FOAF relationship)
Object: http://www.example.org/~joe/ (“Object” resource: joe's home page). Can be a literal...
URL URN
URI
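The subject, predicate and object of this slide can be written down as a single statement. The sketch below is an illustration only: the `URI`, `Literal` and `Triple` types and the `to_ntriples` helper are hypothetical names, and the second statement (a `foaf:name` with the value "Joe Smith") is an assumed example of an object that is a literal rather than a resource.

```python
from typing import NamedTuple, Union


class URI(str):
    """A resource identified by a URI."""


class Literal(str):
    """A plain value such as a name or a number written as text."""


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]


def to_ntriples(triple):
    """Serialize one triple as a line of N-Triples-style text."""
    if isinstance(triple.obj, URI):
        obj = f"<{triple.obj}>"
    else:
        # Escape backslashes and double quotes inside the literal.
        escaped = triple.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    return f"<{triple.subject}> <{triple.predicate}> {obj} ."


JOE = URI("http://www.example.org/~joe/contact.rdf#joesmith")

# The statement from the slide: joesmith has a homepage.
homepage = Triple(JOE, URI("http://xmlns.com/foaf/0.1/homepage"),
                  URI("http://www.example.org/~joe/"))

# Assumed extra statement showing a literal in the object position.
name = Triple(JOE, URI("http://xmlns.com/foaf/0.1/name"), Literal("Joe Smith"))

if __name__ == "__main__":
    print(to_ntriples(homepage))
    print(to_ntriples(name))
```

Only the object may be a literal; the subject and predicate are always resources.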
Story in RDBMS
Concept: Harrison has a dog friend named Layla whose favorite activity is chasing a ball

ObjectTable
ID   Name      Type   Legs  Fur
001  Harrison  Human  2     No
002  Layla     Dog    4     Yes
003  MyBall    Ball   N/A   N/A

FriendshipTable
ID1  ID2
001  002
002  001

SubclassTable
Type1  Type2
Human  Mammal
Dog    Mammal
Ball   Toy

FavActivityTable
ID1  ID2
002  003
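To see what the relational version costs at query time, the four tables can be loaded into an in-memory SQLite database. This is a sketch under assumptions: the column types and the query itself are not from the slides, and N/A is stored as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ObjectTable (ID TEXT PRIMARY KEY, Name TEXT, Type TEXT,
                          Legs INTEGER, Fur TEXT);
CREATE TABLE FriendshipTable (ID1 TEXT, ID2 TEXT);
CREATE TABLE SubclassTable (Type1 TEXT, Type2 TEXT);
CREATE TABLE FavActivityTable (ID1 TEXT, ID2 TEXT);

INSERT INTO ObjectTable VALUES
    ('001', 'Harrison', 'Human', 2, 'No'),
    ('002', 'Layla', 'Dog', 4, 'Yes'),
    ('003', 'MyBall', 'Ball', NULL, NULL);
INSERT INTO FriendshipTable VALUES ('001', '002'), ('002', '001');
INSERT INTO SubclassTable VALUES
    ('Human', 'Mammal'), ('Dog', 'Mammal'), ('Ball', 'Toy');
INSERT INTO FavActivityTable VALUES ('002', '003');
""")

# Who is Harrison's friend, what kind of thing is it, and what does it chase?
# Every relationship in the story becomes another join.
STORY_QUERY = """
SELECT friend.Name, friend.Type, sub.Type2, target.Name
FROM ObjectTable AS person
JOIN FriendshipTable  AS f      ON f.ID1 = person.ID
JOIN ObjectTable      AS friend ON friend.ID = f.ID2
JOIN SubclassTable    AS sub    ON sub.Type1 = friend.Type
JOIN FavActivityTable AS fav    ON fav.ID1 = friend.ID
JOIN ObjectTable      AS target ON target.ID = fav.ID2
WHERE person.Name = 'Harrison'
"""

rows = conn.execute(STORY_QUERY).fetchall()

if __name__ == "__main__":
    print(rows)
```

The single-sentence concept needs five joins across four tables to reassemble.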
Tables, Not Objects
Story in OODBMS
Concept: Harrison has a dog friend named Layla whose favorite activity is chasing a ball

Legend: Class (Concept), Sub-Class, Object (Individual), Attribute (Property)

Classes: Mammal, Toy
Sub-classes: Human, Dog, Ball
Objects: 001, 002, 003

Human isA Mammal
Dog isA Mammal
Ball isA Toy
001 ofType Human
002 ofType Dog
003 ofType Ball
001 hasName Harrison
001 numLegs 2
001 hasFur False
002 hasName Layla
002 numLegs 4
002 hasFur True
003 hasName MyBall
001 foaf:friend 002
002 chases 003
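The same story can be held as a graph and answered by following edges instead of joining tables. The sketch below is a minimal illustration: the edge list is read off the diagram labels, and `build_index` and `follow` are hypothetical helper names, not part of any triple-store product.

```python
from collections import defaultdict

# Edges of the story as (subject, predicate, object) triples.
STORY = [
    ("Human", "isA", "Mammal"), ("Dog", "isA", "Mammal"), ("Ball", "isA", "Toy"),
    ("001", "ofType", "Human"), ("002", "ofType", "Dog"), ("003", "ofType", "Ball"),
    ("001", "hasName", "Harrison"), ("001", "numLegs", "2"), ("001", "hasFur", "False"),
    ("002", "hasName", "Layla"), ("002", "numLegs", "4"), ("002", "hasFur", "True"),
    ("003", "hasName", "MyBall"),
    ("001", "foaf:friend", "002"), ("002", "chases", "003"),
]


def build_index(triples):
    """Index outgoing edges as index[subject][predicate] -> list of objects."""
    index = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        index[subject][predicate].append(obj)
    return index


def follow(index, start, *predicates):
    """Walk from start along the given predicates, one hop per predicate."""
    frontier = [start]
    for predicate in predicates:
        frontier = [obj for node in frontier for obj in index[node][predicate]]
    return frontier


INDEX = build_index(STORY)

if __name__ == "__main__":
    # What is the name of the thing Harrison's friend chases?
    print(follow(INDEX, "001", "foaf:friend", "chases", "hasName"))
    # What broader class does Harrison's friend belong to?
    print(follow(INDEX, "001", "foaf:friend", "ofType", "isA"))
```

Each question is a path through the graph, so adding a new kind of relationship adds edges rather than tables.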
Atomic (axiomatic) Concepts
ID   Name      Type   Legs  Fur
001  Harrison  Human  2     No
002  Layla     Dog    4     Yes
003  MyBall    Ball   N/A   N/A

ID1  ID2
002  003

Callouts: Concept (dog), Instance (Layla), Favorite Activity, Name
RDBMS
OODBMS
Reuse
Time to Implementation Diminishes with Each Phase & Domain
WWW Contrast
Quantitative
B-Tree Index: M x O(log2 N)
Graph: 1 x O(log2 N), plus pointer traversal
Ref: Michel Domenjoud, blog.octo.com
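A rough worked example of the two formulas follows. It is a sketch under assumptions: M is read here as the number of relationships (joins or hops) followed, N as the number of records, and one pointer traversal is counted as one step. The slide defines none of these, and the results are abstract step counts, not benchmarks.

```python
import math


def btree_steps(n_records, m_hops):
    """M index lookups, each costing about log2(N) steps."""
    return m_hops * math.log2(n_records)


def graph_steps(n_records, m_hops, steps_per_hop=1.0):
    """One index lookup to find the start node, then M pointer traversals."""
    return math.log2(n_records) + m_hops * steps_per_hop


if __name__ == "__main__":
    n, m = 2 ** 20, 5  # about a million records, five relationships
    print(btree_steps(n, m))  # 5 * 20 = 100.0
    print(graph_steps(n, m))  # 20 + 5 = 25.0
```

Under these assumptions the gap widens as M grows, since only the relational cost multiplies the log2(N) term.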
Qualitative
Horizontal Scale (Complexity)
Vertical Scale (Size)
Oliver, Andrew C. "Which Freaking Database Should I Use?" InfoWorld. InfoWorld, 02 Aug. 2012. Web. 07 Oct. 2013.
Gonzalez, Rob. "Two Kinds of Big Data." Semanticweb.com. Semanticweb.com, 06 Sept. 2011. Web. 07 Oct. 2013.
GGG
The “Ontology” (Vocabulary)
Taxonomy: Organizational hierarchy based upon similarities in properties
Ontology: Set of representational primitives (classes, attributes & relationships)
with which to model a domain of knowledge or discourse*
Taxonomy diagram nodes: Human, Mammal, Novel, Doc, Author, Name
Ontology diagram relationships (domain, property, range):
Novel title String
Human profession Author
Novel creator Author
Human name String
Key point: The triple-store's construction enables the extraction of
meaning through inference (description logic), i.e., “Smart Data”
*Encyclopedia of Database Systems, L. Liu et al, 2009
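The key point about inference can be illustrated with one simple rule: if an individual is of a type, it is also of every superclass of that type. The sketch below applies only that rule, a small fragment of what a description-logic reasoner does. The `Mammal isA Animal` fact is a hypothetical addition to show that the rule chains, and the function names are illustrative.

```python
FACTS = [
    ("Layla", "ofType", "Dog"),
    ("Harrison", "ofType", "Human"),
    ("Dog", "isA", "Mammal"),
    ("Human", "isA", "Mammal"),
    ("Mammal", "isA", "Animal"),  # hypothetical, not in the slides
]


def all_superclasses(cls, parents):
    """Collect every class reachable from cls through isA edges."""
    found, stack = set(), [cls]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def infer_types(triples):
    """Return ofType statements that are implied but not explicitly stored."""
    parents = {}
    for subject, predicate, obj in triples:
        if predicate == "isA":
            parents.setdefault(subject, set()).add(obj)

    inferred = set()
    for subject, predicate, obj in triples:
        if predicate == "ofType":
            for ancestor in all_superclasses(obj, parents):
                inferred.add((subject, "ofType", ancestor))
    return inferred - set(triples)


if __name__ == "__main__":
    for statement in sorted(infer_types(FACTS)):
        print(statement)
```

Nothing in the stored facts says Layla is a mammal; the statement is derived from the class hierarchy at query time.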
Summary
Aspect               Web 2.0            Web 3.0
Unification          Join               Built-in
Retrieval            Keyword indexing   Pointer traversal
Query                Keyword matching   Inference
Citizens             Documents          Data
Polysemy & Synonymy  Yes                No
Construct            Table              Triple (how)
Descriptor           URL (where)        URI = URL + URN (what)
Data                 Aggregated         Connected
Flexibility          Low                High
Dynamic
Relationships
Meaning
Ian Robinson, Neo4J
Basic (very) Architecture
Unstructured Data
Structure-er (ETL)
Triple-store (OODBMS)
Inference and rendering engine
SPARQL
Web Server
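The flow through these components can be sketched end to end in a few lines. This is an illustration under assumptions, not the architecture's actual code: `structure` stands in for the ETL stage and assumes records have already been parsed into dictionaries, the list `STORE` stands in for the triple-store, and `query` mimics the way a SPARQL basic graph pattern binds `?variables`, without being SPARQL itself.

```python
RECORDS = [
    {"id": "001", "hasName": "Harrison", "foaf:friend": "002"},
    {"id": "002", "hasName": "Layla", "chases": "003"},
    {"id": "003", "hasName": "MyBall"},
]


def structure(record):
    """ETL stand-in: turn one parsed record into (subject, predicate, object)."""
    subject = record["id"]
    return [(subject, key, value) for key, value in record.items() if key != "id"]


def match(pattern, triple, bindings):
    """Extend bindings so pattern equals triple, or return None on conflict."""
    extended = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def query(triples, patterns):
    """Return every variable binding that satisfies all patterns together."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            if (extended := match(pattern, triple, bindings)) is not None
        ]
    return solutions


STORE = [triple for record in RECORDS for triple in structure(record)]

PATTERNS = [
    ("?who", "foaf:friend", "?pet"),
    ("?pet", "chases", "?toy"),
    ("?toy", "hasName", "?name"),
]

if __name__ == "__main__":
    print(query(STORE, PATTERNS))
```

A web server would sit in front of `query` and render the bindings; that stage is omitted here.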
Open Government Use Case
Database
Search
ORM
DCAT Data Portal
.NET (wrapper)
PHP
Pylons Framework
Publish
Aggregate
Project
2011
2014
Consume
Not a new idea...
LOD Cloud, October 2007
But It Is Growing
Ref: lod-cloud.net
Community
Ontologies
Vendors
Practitioners
Consortia
Government
Community
Conferences
Testimonials
This inference capability makes both the journalist tagging and the triple store powered SPARQL queries simpler and indeed quicker than a traditional SQL approach. Dynamic aggregations based on inferred statements increase the quality and breadth of content across the site. The RDF triple approach also facilitates agile modeling, whereas traditional relational schema modeling is less flexible and also increases query complexity.
-BBC
Because there haven't been graph models in the market-leading relational databases, the graph approach has been mostly overlooked by corporate users. But it shouldn't be ignored anymore, he added: Graphs provide "a good representation of the data structure with which you're dealing."
-Robin Bloor, the Bloor Group, July 2013
We believe that Graph Search is part of a trend that is much bigger than Facebook, and more widespread than search. Facebook is tapping into a fundamentally new way to exploit the information that exists in all the world’s databases. In this post, we will look at the Facebook announcement from a different angle, that of connected data: a growing trend that is on the verge of changing how companies large and small understand their data.
-Andreas Kollegger, Jan 2013
The Web as we know it today will not be the Web as we know it tomorrow. The Web of today is oriented towards the universal accessibility of files (e.g. web pages, images). The Web of today can be thought of as a large-scale, distributed file system. The Web of tomorrow will encode any datum (e.g. strings, integers, dates). The Web of tomorrow can be thought of as a large-scale, distributed database.
-Marko Rodriguez, Los Alamos, 2009
The construction and structure of graphs or networks is the key to understanding the complex world around us.
-Albert-László Barabási, “Linked: The New Science of Networks”
Today, most data is being managed in traditional relational databases. These databases have trouble scaling to accommodate the increase in data volume, variety, and complexity. It is also difficult and expensive for these relational databases to provide answers to clinical questions in a timely manner. These relational databases are not a viable solution for solving this new category of data challenges.
-KnowMED web site
While traditional, relational data warehousing and federation approaches can scale well and are effective for many core data storage and access requirements, such approaches often fail when facing the dynamic changes and the inherent complexity of data integration requirements for Healthcare / Life Sciences (HCLS) research. Semantic integration methods assure coherence, harmonize synonyms and different terminologies, and provide an extensible, flexible data integration platform and interactive knowledge base for relevant network analysis.
-IO Informatics
Drivers for a Paradigm Shift
80% of data being generated is unstructured. *
*Dennis McCatherty, Baseline magazine
**CFO.com, http://www3.cfo.com/article/2012/2/analytics_big-data-business-intelligence
● Hobbled analytics
● Poor collaboration
● Opacity
● TCO
● Performance
11:1**
Silo
IoT
$$$$
Barriers
● Financial
● Practical
● Technical
● Making the business case
Enablers
● Necessity - W3C Use Cases
● Tools
● Accelerators & Incubators
● Awareness
● Money
Community & Regulatory
● DTC – Data Transparency Coalition (Data Transparency 2013, Sep 11-12)
● UDI – Unique Device Identification (FDA)
● ACA – Affordable Care Act (administrative simplification clause)
● FITA – Foundation for Information Technology Accessibility
● LEI – Legal Entity Identifier (mandate)
● FIBO – Financial Industry Business Ontology (Dodd-Frank)
● ACT – Advisory Committee on Transparency (Sunlight Foundation)
● Schema.org
● DATA – Digital Accountability and Transparency Act
● CMS – Centers for Medicare & Medicaid Services
● W3C eGovernment
● Executive Order
“Specifically, this Memorandum requires agencies to collect or create information in a way that supports downstream information processing and dissemination activities. This includes using machine readable and open formats, data standards, and common core and extensible metadata for all new information creation and collection efforts.”
-OMB Memorandum M-13-13, May 9, 2013
Pew Study, May 2010
LOD vs Graphs
Open Data Certificate
Raw Pilot
Standard Expert
Unintended consequences (of the good kind)
● Other domains and business use cases
– Supply chain
– Philanthropy
– Social
– Investor relations
– Word search
– Real Estate
– Arts & humanities
– E-Commerce (“perfect capitalism”¹)
– Education
● Other benefits
– SEO - “The Knowledge Graph does away with the inherent ambiguity of conventional search at Google's search box”²
– Accessible community driven content
– Unleash the full power of “big data”
– Unanticipated reuse
¹Physics of the Future by Michio Kaku
²Google Semantic Search by David Amerland
The Way Forward
Get the word out
Facilitate ontology development
Build POCs
Nurture community and industry involvement
Innovate!
LOD Goes Global
Further Reading
● Use Cases
● Linked Data e-book
● Semantic Community
● Web Science Trust
● STI International
● Schema.org
● semanticweb.com
● World Wide Web Consortium
● Depository Trust and Clearing Corporation
● Freebase
● Jena
Terminology
Term: Definition
RDF: Resource Description Framework, a way of representing information on the web (i.e., a set of statements). It is often serialized as XML and provides the ontology syntax (framework) for describing resources
OWL: Web Ontology Language. Standard framework for describing ontologies, based upon XML for the low-level syntax and RDF/RDFS for the mid-level syntax
RDFS: RDF Schema, recommended structure for RDF resources
R2RML: RDB to RDF Mapping Language. W3C-sanctioned language for expressing customized mappings from relational databases to RDF data sets. D2RQ is an open-source counterpart (not as capable)
Open/Closed World: Open: any statement not explicitly known to be true or false is considered unknown (OWL). Closed: any statement that is not known to be true is false (SQL).
Concept: Description logic synonym for “Class”
Jena: Apache foundation PPMC (Podling Project Management Committee) to develop a semantic app framework (Java API)
RIF: Rule Interchange Format, a W3C recommendation
SPARQL: SPARQL Protocol and RDF Query Language. Query language for RDF data stores; an “endpoint” is a resource that can accept such queries and return results
SWRL: Semantic Web Rule Language, combines OWL & RuleML
FIBO: Financial Industry Business Ontology
SBVR: Semantics of Business Vocabulary and Rules. OMG-developed, human-readable business modeling language
OMG: Object Management Group. Nonprofit industry consortium dedicated to data interoperability (ex. UML, MDA, CORBA)
OASIS: Organization for the Advancement of Structured Information Standards. Standards body for web services, e-commerce, supply-chain and many other domains (including
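The open-world and closed-world assumptions from the Terminology slide differ only in how they answer a question about a statement nobody has recorded. The sketch below is a minimal illustration with made-up facts and hypothetical function names; real OWL reasoners and SQL engines are far more involved.

```python
KNOWN_TRUE = {("Layla", "ofType", "Dog"), ("Layla", "chases", "MyBall")}
KNOWN_FALSE = {("Harrison", "ofType", "Dog")}


def ask_closed_world(statement):
    """Closed world (SQL-like): anything not known to be true is false."""
    return statement in KNOWN_TRUE


def ask_open_world(statement):
    """Open world (OWL-like): True, False, or None when the answer is unknown."""
    if statement in KNOWN_TRUE:
        return True
    if statement in KNOWN_FALSE:
        return False
    return None


if __name__ == "__main__":
    unrecorded = ("Layla", "chases", "Squirrel")  # hypothetical statement
    print(ask_closed_world(unrecorded))
    print(ask_open_world(unrecorded))
```

The closed-world answer is a firm "no", while the open-world answer leaves room for the fact to be asserted later by another source.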
FacetApp SmartData Platform
OntoApp Toolkit
FacetApp Platform
Client App, Ontology (x2)
Enterprise Applications
OntoApp Server (x2)
Inference Server (x2)
Inference Engine (x2)
Harness Reusable Objects for Fast, Scalable Model-driven Enterprise Apps
Configure enterprise applications in a model vs. programming low level source code.
Copyright 2013 FacetApp LLC, Confidential and Proprietary
SmartData Breakthrough
Maintained by One Model
Application Data
Application Logic
Technical Programmers
Database Admins
Data Scientist
Focus is where it should be – on maximizing ROI
S&P Findings.
          Traditional   Semantic App Development
Phases    Design, Build, Test (both approaches)
Effort    3000+ hrs     1000 hrs
Cost*     $550k         $250k
*Labor Component Only.
"FacetApp provided conservatively, a cost savings of 30% and depending on the situation could have returned 70% based on our experience last month."
- Gene Mishchenko, Lead Info Architect, S&P Ratings Services
FacetApp API
Thank You!
Tony Dobaj (doe-bye)
apdobaj@facetapp.com
720-277-2052

The technical case for a semantic web
