ORE & SWAP: Composition & Complexity Metadata Infrastructures, MPG eScience Seminar, Berlin
ORE & SWAP: Composition & Complexity
Background/Foundations: Web Architecture; Resource Description Framework
OAI Object Reuse & Exchange (OAI ORE)
More Background/Foundations: The DCMI Abstract Model; Dublin Core Application Profiles
Scholarly Works DC Application Profile (SWAP)
Conclusions
Background:  Web Architecture, RDF &  (briefly) Linked Data
W3C Web Architecture: Identification. Resource = anything of interest. Resources are identified by URIs. [Diagram: a URI identifies a Resource.]
W3C Web Architecture: Representation. Representation = data that encodes information about the state of a resource. One resource may have multiple representations. Representations may change over time. [Diagram: a URI identifies a Resource; Representation 1 and Representation 2 each represent it.]
W3C Web Architecture: Interaction. Resources are manipulated through the exchange of representations. Messages are exchanged using network protocols. The resource itself is always “hidden”. [Diagram: the Client sends a request message (may include a representation) to the Server; the Server returns a response message (may include a representation); content negotiation selects between Representation 1 and Representation 2 of the Resource.]
W3C Web Architecture: Interaction
Client 1, request:
    GET /doc/d/ HTTP/1.1
    Host: example.org
Server, response to Client 1:
    HTTP/1.1 200 OK
    Content-Type: text/html
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
Client 2, request:
    GET /doc/d/ HTTP/1.1
    Host: example.org
    Accept: application/xhtml+xml
Server, response to Client 2:
    HTTP/1.1 200 OK
    Content-Type: application/xhtml+xml
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
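The two exchanges above can be read as a single selection rule on the server side. The following is a minimal sketch of that rule, not how any particular server implements it: the `REPRESENTATIONS` table, the `DEFAULT_TYPE` fallback and the `negotiate` function are all names invented here for illustration, and q-values and wildcards are deliberately ignored.

```python
# Hypothetical in-memory stand-in for the server in the exchanges above:
# one resource (/doc/d/) with two representations, keyed by media type.
REPRESENTATIONS = {
    "text/html": "<!DOCTYPE html ...",
    "application/xhtml+xml": '<?xml version="1.0" encoding="UTF-8"?> ...',
}
DEFAULT_TYPE = "text/html"  # assumption: what is sent when no Accept is given


def negotiate(accept_header=None):
    """Pick a representation for an Accept header value.

    Simplified on purpose: q-values and wildcards are ignored, and an
    unsatisfiable Accept falls back to the default instead of a 406.
    """
    if accept_header:
        for item in accept_header.split(","):
            media_type = item.split(";")[0].strip().lower()
            if media_type in REPRESENTATIONS:
                return media_type, REPRESENTATIONS[media_type]
    return DEFAULT_TYPE, REPRESENTATIONS[DEFAULT_TYPE]


content_type_1, _ = negotiate()                         # Client 1: no Accept
content_type_2, _ = negotiate("application/xhtml+xml")  # Client 2
```

Both clients ask for the same URI; only the representation returned differs, which is the point of the slide.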
Resource Description Framework (RDF). Set of specifications from W3C. A simple data model for making assertions about relationships between resources: “Thing-X is-related-in-some-way-to Thing-Y”. Types of relationship are properties. Assertions are made in the form of triples: Subject, Predicate, Object. Sets of triples are represented as graphs. Serialisation in multiple formats, including RDF/XML.
Resource Description Framework (RDF). [Diagram: Document D is created by Person P; Document D is published by Organisation O.]
Resource Description Framework (RDF). RDF is designed for use on the Web. Global context, global naming: URIs. URIs as names for things described (subject, object). URIs as names of relationships between things (predicate). URI-based extensibility. Also allows for “unnamed” things, “blank nodes” (subject, object), and “literals” = text strings (object only). Formal semantics defines rules for comparing & merging sets of triples and for logical inferencing.
Resource Description Framework (RDF). URIs as subjects & objects: http://ex.org/doc/d, http://ex.org/per/p, http://ex.org/org/o. URIs as predicates: http://purl.org/dc/terms/creator, http://purl.org/dc/terms/publisher.
Subject | Predicate | Object
http://ex.org/doc/d | http://purl.org/dc/terms/creator | http://ex.org/per/p
http://ex.org/doc/d | http://purl.org/dc/terms/publisher | http://ex.org/org/o
Resource Description Framework (RDF). URIs are often abbreviated as prefixed names in RDF syntaxes; literals appear as objects. [Diagram: document:d dcterms:creator person:p; document:d dcterms:publisher organisation:o; person:p foaf:name “Andy Powell”; organisation:o foaf:name “DCMI”.]
Resource Description Framework (RDF). Two graphs with common nodes… [Diagram, graph 1: the four triples above. Graph 2: person:p foaf:mbox mailto:andy.powell@example.org; person:p foaf:homepage http://example.org/doc/andy/.]
Resource Description Framework (RDF). … can be merged. [Diagram: a single graph in which the node person:p is the object of dcterms:creator and the subject of foaf:name, foaf:mbox and foaf:homepage.]
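Merging graphs that share URI-named nodes can be sketched with ordinary sets. This is an illustration only: triples are plain tuples of strings, the split of triples between the two graphs is my reading of the flattened diagrams, and `merge` and `about` are helper names invented here. Real RDF merging must also keep blank nodes from different graphs apart, which this sketch does not attempt.

```python
# A triple is (subject, predicate, object); a graph is a set of triples.
graph_1 = {
    ("document:d", "dcterms:creator", "person:p"),
    ("document:d", "dcterms:publisher", "organisation:o"),
    ("person:p", "foaf:name", '"Andy Powell"'),
    ("organisation:o", "foaf:name", '"DCMI"'),
}
graph_2 = {
    ("person:p", "foaf:mbox", "mailto:andy.powell@example.org"),
    ("person:p", "foaf:homepage", "http://example.org/doc/andy/"),
}


def merge(*graphs):
    """Union of the triples; nodes with the same URI coincide automatically."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def about(graph, subject):
    """All (predicate, object) pairs asserted about one subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


merged = merge(graph_1, graph_2)
```

Because person:p is named by the same URI in both graphs, `about(merged, "person:p")` returns the name from the first graph together with the mailbox and homepage from the second.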
Web Architecture: httpRange-14. Widespread use of http URIs as names for things other than documents, while also serving representations. Problem: does the URI identify the thing or the document? Solution: the W3C TAG httpRange-14 resolution. If the server responds to GET with 2xx, the resource is an information resource (document). If the server responds to GET with 303, the resource could be anything. Distinguish information resources (Web documents), which have a URI and a representation, from “non-information resources” (“real world things”), which have a URI and no representation, but do have a description. Sauermann, Cyganiak, Cool URIs for the Semantic Web, W3C Interest Group Note, 31 March 2008, http://www.w3.org/TR/2008/NOTE-cooluris-20080331/
W3C Web Architecture: httpRange-14
Client, first request:
    GET /per/p/ HTTP/1.1
    Host: example.org
Server, response:
    HTTP/1.1 303 See Other
    Location: http://example.org/doc/per/p/
Client, second request:
    GET /doc/per/p/ HTTP/1.1
    Host: example.org
Server, response:
    HTTP/1.1 200 OK
    Content-Type: application/xhtml+xml
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
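What a client may conclude from each response in the exchange above can be sketched as a small decision function. The `RESPONSES` table is a hypothetical stand-in for the server at example.org, and `dereference` is a name invented here; no network access is involved.

```python
# Hypothetical canned responses: URI -> (status, headers).
RESPONSES = {
    "http://example.org/per/p/": (
        303, {"Location": "http://example.org/doc/per/p/"}),
    "http://example.org/doc/per/p/": (
        200, {"Content-Type": "application/xhtml+xml"}),
}


def dereference(uri):
    """Return (what the response implies about `uri`, where to look next)."""
    status, headers = RESPONSES[uri]
    if 200 <= status < 300:
        # 2xx: the URI identifies an information resource (a document).
        return "information resource", uri
    if status == 303:
        # 303: the resource could be anything; a description is elsewhere.
        return "could be anything", headers["Location"]
    return "unknown", None


kind, description_uri = dereference("http://example.org/per/p/")
```

Following `description_uri` with a second call yields the 2xx case, matching the two-step exchange on the slide.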
Linked Data. Berners-Lee’s Linked Data principles, Design Issues Note (2006), http://www.w3.org/DesignIssues/LinkedData.html. 1. Use URIs as names for things. 2. Use http URIs so that people can look up those names. 3. When someone looks up a URI, provide useful information. 4. Include links to other URIs, so that they can discover more things.
How to Publish Linked Data on the Web. Bizer, Cyganiak, Heath: How to Publish Linked Data on the Web (2007), http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/LinkedDataTutorial/. Provides guidelines on: the subset of RDF features to use (for merging & querying); choosing URIs, linking and using URI aliases; “useful information to return” (description, backlinks, related descriptions, metadata); recipes for serving data.
Open Archives Initiative Object Reuse & Exchange (OAI ORE) With thanks to Herbert Van de Sompel (Los Alamos National Laboratory, USA) for several of the slides used in this section
Open Archives Initiative Object Reuse & Exchange (ORE). Project of the Open Archives Initiative, http://www.openarchives.org/ore/. Funded by The Andrew W. Mellon Foundation. Additional support from The Coalition for Networked Information, Joint Information Systems Committee, Microsoft Corporation and The National Science Foundation. Co-ordinated by Carl Lagoze & Herbert Van de Sompel.
The problem ORE tries to solve. It is often useful to conceptualise (scholarly and other) resources as grouped into aggregations. Typically, the way in which resources are made available on the Web: does not make the aggregate nature of the resource explicit; presents component/member resources as linked in some way, but without making clear how they are related; introduces ambiguity between identification of the set and identification of a component/member.
Aggregations! http://arxiv.org/abs/astro-ph/0611775 [Screenshot annotations: Splash page, Formats, Versions, Identifiers, Relationships.]
The Web
An Aggregation and the Web Resources of an Aggregation are distinct URI-identified Web resources Missing are: The boundary that delineates the Aggregation in the Web An identity (URI) for the Aggregation
The ORE approach. Make Aggregations resources, identified by URIs. Create RDF-based descriptions of Aggregations (& their component resources): relationships between the Aggregation & component resources; attributes of the Aggregation and component resources; relationships between the Aggregation, component resources and other resources. Make Resource Maps resources, identified by URIs; include attributes of the Resource Map and relationships between the Resource Map and other resources. Make Resource Maps available using the conventions of Web Arch/Cool URIs.
Publish a Resource Map to the Web
The Resource Map Identifies and Describes the Aggregation
The Resource Map and Aggregation as (distinct) Web resources
Constructing an ORE Resource Map
Some resources that belong together
Introduce the Aggregation This resource is an Aggregation
Express the  ore:aggregates  relationship The inverse is  ore:isAggregatedBy
Introduce the Resource Map
Express the  ore:describes  relationship
The ore:isDescribedBy relationship. ore:isDescribedBy is the inverse of ore:describes; subproperty of rdfs:seeAlso
Express metadata about the Resource Map This corresponds to  metadata  from the Linked Data recommendations
A Resource Map can “say” a lot … This corresponds to the  description ,  related descriptions ,  backlinks ,  metadata  from the Linked Data recommendations
But minimally it “says” this … This corresponds to the  description  (minimal), and  metadata  from the Linked Data recommendations
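The construction steps on the preceding slides can be sketched as a function that emits the triples of a minimal Resource Map. This is an illustration, not the ORE specification's own serialisation: all example.org URIs are hypothetical, the helper names are invented here, and the choice of dcterms:creator and dcterms:modified as the Resource Map metadata is an assumption about what "minimal" requires.

```python
ORE = "http://www.openarchives.org/ore/terms/"
DCTERMS = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def minimal_resource_map(rem_uri, aggregation_uri, aggregated_uris,
                         creator_uri, modified):
    """Triples for a Resource Map describing one Aggregation."""
    triples = {
        (rem_uri, RDF_TYPE, ORE + "ResourceMap"),
        (rem_uri, ORE + "describes", aggregation_uri),
        # Metadata about the Resource Map itself (assumed minimal set).
        (rem_uri, DCTERMS + "creator", creator_uri),
        (rem_uri, DCTERMS + "modified", modified),
        (aggregation_uri, RDF_TYPE, ORE + "Aggregation"),
    }
    for uri in aggregated_uris:
        triples.add((aggregation_uri, ORE + "aggregates", uri))
    return triples


def with_inverses(triples):
    """Add ore:isDescribedBy / ore:isAggregatedBy for each forward triple."""
    inverse = {
        ORE + "describes": ORE + "isDescribedBy",
        ORE + "aggregates": ORE + "isAggregatedBy",
    }
    extra = {(o, inverse[p], s) for s, p, o in triples if p in inverse}
    return triples | extra


rem = minimal_resource_map(
    "http://example.org/rem/1",
    "http://example.org/agg/1",
    ["http://example.org/paper.pdf", "http://example.org/data.csv"],
    "http://example.org/per/p",
    "2008-10-01T00:00:00Z",
)
```

Note that the Resource Map and the Aggregation have distinct URIs, as the slides stress; the inverses are derivable and need not be stated explicitly.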
Proxy: a Resource in the Context of an Aggregation. ore:proxyFor and ore:proxyIn introduce a Proxy for an Aggregated Resource.
ore:lineage: an Aggregated Resource originated in another Aggregation. ore:lineage is a relationship between Proxies.
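The Proxy pattern can be sketched as a few extra triples. The example.org URIs and the `proxy_triples` helper are hypothetical; the sketch shows one resource aggregated in two Aggregations, with the second Proxy recording its lineage from the first.

```python
ORE = "http://www.openarchives.org/ore/terms/"


def proxy_triples(proxy_uri, resource_uri, aggregation_uri,
                  lineage_proxy_uri=None):
    """Triples that introduce a Proxy for a resource within one Aggregation."""
    triples = [
        (proxy_uri, ORE + "proxyFor", resource_uri),
        (proxy_uri, ORE + "proxyIn", aggregation_uri),
    ]
    if lineage_proxy_uri is not None:
        # ore:lineage links this Proxy to a Proxy in the Aggregation
        # from which the resource originated.
        triples.append((proxy_uri, ORE + "lineage", lineage_proxy_uri))
    return triples


resource = "http://example.org/paper.pdf"
original = proxy_triples("http://example.org/agg/1/proxy/1",
                         resource, "http://example.org/agg/1")
reused = proxy_triples("http://example.org/agg/2/proxy/1",
                       resource, "http://example.org/agg/2",
                       lineage_proxy_uri="http://example.org/agg/1/proxy/1")
```

Statements made about a Proxy apply to the resource only in the context of that Aggregation, which is why lineage connects Proxies rather than the resource itself.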
The ORE model in summary
Representing an ORE Resource Map. An ORE Resource Map is an RDF Graph, so it can be represented using any RDF syntax. ORE provides guidelines for using RDF/XML and RDFa. Also guidelines for representing an ORE Resource Map as an Atom Entry Document, with a GRDDL transformation available.
Publishing an ORE Resource Map. ORE Aggregation as “non-information resource”/“real-world thing”. ORE Resource Map as “information resource”/“Web document”. ORE recommends selected HTTP “recipes” from Cool URIs for the Semantic Web: “Hash URI”; simple 303 redirect to URI of Resource Map; content negotiation + 303 redirect to URI of Resource Map.
Content negotiation with 303 redirect
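The third recipe can be sketched as follows: the Aggregation URI never returns a representation itself, it redirects with 303 to whichever Resource Map serialisation best matches the Accept header. The URIs, the `RESOURCE_MAPS` table, the default media type and the `respond` function are assumptions made for this illustration.

```python
# Hypothetical URIs: one Aggregation, two serialisations of its Resource Map.
RESOURCE_MAPS = {
    "http://example.org/agg/1": {
        "application/rdf+xml": "http://example.org/rem/1.rdf",
        "application/atom+xml": "http://example.org/rem/1.atom",
    }
}
DEFAULT_TYPE = "application/rdf+xml"  # assumption: fallback serialisation


def respond(aggregation_uri, accept_header=""):
    """Return (status, headers) for a GET on an Aggregation URI."""
    maps = RESOURCE_MAPS.get(aggregation_uri)
    if maps is None:
        return 404, {}
    # Simplification: media types are tried in header order, q-values ignored.
    wanted = [part.split(";")[0].strip() for part in accept_header.split(",")]
    chosen = next((t for t in wanted if t in maps), DEFAULT_TYPE)
    return 303, {"Location": maps[chosen]}


status, headers = respond("http://example.org/agg/1", "application/atom+xml")
```

A client that then dereferences the `Location` URI receives a 200 response carrying the Resource Map, so the Aggregation stays a "non-information resource" throughout.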
Summary. ORE addresses the problem of describing Aggregations of resources. Tries to do so in a “resource-centric” way (cf. OAI-PMH). Tries to cater for both Semantic Web contexts and Atom contexts. Agnostic about the nature of aggregated resources and about other relationships between aggregations and aggregated resources. “The least we can specify whilst being useful” (Jim Downing). Generic; potentially, broadly applicable. Will typically be used alongside other vocabularies, models?
Applications in e-Science domain Applicable to any Aggregation! e.g. Transfer/migration/archiving of “packages” of resources? Collected outputs of activity as Aggregations? Thesis + datasets as Aggregations (THEOREM project, Cambridge) Papers + chemistry data (ORECHEM) And more….
More Background: The DCMI Abstract Model and Dublin Core Application Profiles
Why DCAM? Dublin Core in c2003. Metadata vocabularies … but what is a DC “element”? … what sort of “terms” are used in DC metadata? Syntax independence & encoding guidelines … but what is being “encoded”? “Dublin Core”? “Simple” and “Qualified” DC … vocabularies? … formats? (e.g. oai_dc) … constraints on use of vocabularies? On which vocabularies? Grammatical Principles (including “1-to-1 rule”) … fairly informal. Relationship between DC & Resource Description Framework: history of co-evolution.
DCMI Abstract Model. Work by DCMI Architecture WG from mid-2003, initiated by Andy Powell (UKOLN, Eduserv). Second Version, DCMI Recommendation, 2007-06-04, http://dublincore.org/documents/2007/06/04/abstract-model/. Based on RDF Semantics: binary relations between resources; use of URIs; RDF Schema inferencing. DCAM uses the terminology: Described resource; Property = type of relationship; Value = other resource. Introduces a different “description model”: “Description set” as a community-specific “view” of an RDF Graph.
[Diagram (DCAM description set model, built up over several slides): a Description Set contains Descriptions. Each Description has a Resource URI and Statements. Each Statement has a Property URI and either a Literal Value Surrogate or a Non-Literal Value Surrogate. A Literal Value Surrogate carries a Value string, with a Language or a Syntax Enc Scheme URI. A Non-Literal Value Surrogate carries a Value URI, a Vocab Enc Scheme URI and Value strings, each with a Language or a Syntax Enc Scheme URI.]
[Diagram (the same model as an RDF graph, shown twice): Resource URI -- Property URI --> Value URI; Value URI -- dcam:memberOf --> VES URI; Value URI -- rdf:value --> “literal”@lang or “literal”^^SES URI; Resource URI -- Property URI --> “literal”@lang or “literal”^^SES URI.]
Example: description of document, description of publisher
Description Set
  Description: Resource URI <http://dublincore.org/documents/2007/06/04/abstract-model/>
    Statement: Property URI <http://purl.org/dc/terms/subject>
      Non-Literal Value Surrogate: Value URI <http://example.org/mySH/h123>; Vocab Enc Scheme URI <http://example.org/terms/mySH>; Value String “Metadata” (en); Value String “Métadonnées” (fr)
    Statement: Property URI <http://purl.org/dc/terms/publisher>
      Non-Literal Value Surrogate: Value URI <http://example.org/org/DCMI>
  Description: Resource URI <http://example.org/org/DCMI>
    Statement: Property URI <http://xmlns.com/foaf/0.1/name>
      Literal Value Surrogate: Value String “Dublin Core Metadata Initiative” (en)
    Statement: Property URI <http://purl.org/dc/terms/created>
      Literal Value Surrogate: Value String “1995” ^^xsd:gYear
[Diagram (the example as an RDF graph, shown twice):
<http://dublincore.org/.../abstract-model/> dcterms:subject <http://example.org/mySH/h123>
<http://example.org/mySH/h123> dcam:memberOf <http://example.org/terms/mySH>
<http://example.org/mySH/h123> rdf:value “Metadata”@en
<http://example.org/mySH/h123> rdf:value “Métadonnées”@fr
<http://dublincore.org/.../abstract-model/> dcterms:publisher <http://example.org/org/DCMI>
<http://example.org/org/DCMI> foaf:name “Dublin Core…”@en
<http://example.org/org/DCMI> dcterms:created “1995”^^xsd:gYear]
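The step from a DCAM description set to an RDF graph can be sketched as a small flattening function. The dictionary layout and the `to_triples` name are invented for this illustration, and attaching dcterms:created to the publisher's description is my reading of the flattened slide; only the mapping itself (value URI as object, dcam:memberOf for the vocabulary encoding scheme, rdf:value for value strings) is taken from the diagrams.

```python
DOC = "http://dublincore.org/documents/2007/06/04/abstract-model/"
ORG = "http://example.org/org/DCMI"

# Hypothetical in-memory layout for the example description set.
description_set = [
    {"resource": DOC, "statements": [
        {"property": "dcterms:subject",
         "value_uri": "http://example.org/mySH/h123",
         "ves": "http://example.org/terms/mySH",
         "value_strings": ['"Metadata"@en', '"Métadonnées"@fr']},
        {"property": "dcterms:publisher", "value_uri": ORG},
    ]},
    {"resource": ORG, "statements": [
        {"property": "foaf:name",
         "literal": '"Dublin Core Metadata Initiative"@en'},
        {"property": "dcterms:created", "literal": '"1995"^^xsd:gYear'},
    ]},
]


def to_triples(descriptions):
    """Flatten descriptions and statements into (s, p, o) triples."""
    triples = []
    for description in descriptions:
        subject = description["resource"]
        for statement in description["statements"]:
            prop = statement["property"]
            if "literal" in statement:
                # Literal value surrogate: the literal is the object.
                triples.append((subject, prop, statement["literal"]))
                continue
            # Non-literal value surrogate: the value URI is the object.
            value = statement["value_uri"]
            triples.append((subject, prop, value))
            if "ves" in statement:
                triples.append((value, "dcam:memberOf", statement["ves"]))
            for text in statement.get("value_strings", []):
                triples.append((value, "rdf:value", text))
    return triples


triples = to_triples(description_set)
```

The two descriptions are linked only through the shared URI of the publisher, which is what lets the description set be read as a view of a single RDF graph.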
Dublin Core Application Profile. Specification of how to construct & deploy description sets (descriptions, statements) to serve some purpose and meet some set of requirements. At core, a profile of a “description set”: a set of constraints on the description set, based on an E-R model of the problem space. Enables structural validation and predictability for processing, querying etc -> tools.
DCMI Description Set Profile (DSP). A way of describing structural constraints on a description set: the resources that may be described by descriptions in the description set; the properties that may be referenced in statements; the ways a value surrogate may be given. Description templates, statement templates. Model & XML Syntax for DSP. Working draft by Mikael Nilsson (Royal Institute of Technology, Sweden), http://dublincore.org/documents/2008/03/31/dc-dsp/
Description & Statement Templates
<DescriptionTemplate minOccurs="1" maxOccurs="1" standalone="yes">
  <StatementTemplate minOccurs="1" maxOccurs="1" type="literal">
    <Property>http://purl.org/dc/terms/title</Property>
  </StatementTemplate>
  …
</DescriptionTemplate>
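How such a template enables structural validation can be sketched with the standard library XML parser. This is a simplified illustration, not a DSP processor: the XML is the slide's fragment without the elided parts and without any namespace, the `"infinity"` default for `maxOccurs` and the `"nonliteral"` kind are assumptions, and `check_description` is a name invented here.

```python
import xml.etree.ElementTree as ET

DSP_XML = """
<DescriptionTemplate minOccurs="1" maxOccurs="1" standalone="yes">
  <StatementTemplate minOccurs="1" maxOccurs="1" type="literal">
    <Property>http://purl.org/dc/terms/title</Property>
  </StatementTemplate>
</DescriptionTemplate>
"""


def check_description(template_xml, statements):
    """Check (property_uri, value_kind) pairs against statement templates.

    Returns a list of human-readable problems; empty means it conforms.
    """
    root = ET.fromstring(template_xml)
    problems = []
    for template in root.findall("StatementTemplate"):
        prop = template.findtext("Property").strip()
        lowest = int(template.get("minOccurs", "0"))
        highest = template.get("maxOccurs", "infinity")
        kinds = [kind for uri, kind in statements if uri == prop]
        if len(kinds) < lowest:
            problems.append(f"{prop}: needs at least {lowest} statement(s)")
        if highest != "infinity" and len(kinds) > int(highest):
            problems.append(f"{prop}: allows at most {highest} statement(s)")
        wanted = template.get("type")
        if wanted and any(kind != wanted for kind in kinds):
            problems.append(f"{prop}: value must be {wanted}")
    return problems


TITLE = "http://purl.org/dc/terms/title"
ok = check_description(DSP_XML, [(TITLE, "literal")])
missing = check_description(DSP_XML, [])
```

Here `ok` is empty and `missing` reports the absent title, which is the kind of predictability for tools that the profile is meant to provide.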
Dublin Core Application Profile. A DC Application Profile is a “packet of documentation” which consists of: Functional requirements (desirable); Domain model (mandatory); Description Set Profile (DSP) (mandatory); Usage guidelines (optional); Encoding syntax guidelines (optional).
Foundation standards Domain standards Application Profile The “Singapore Framework”
Scholarly Works (ePrints) DC Application Profile (SWAP)
Scholarly Works DC Application Profile. SWAP/eprints Application Profile, http://www.ukoln.ac.uk/repositories/digirep/index/Eprints_Application_Profile. Development funded by JISC, Summer 2006. Co-ordinated by Julie Allinson (UKOLN) & Andy Powell (Eduserv Foundation). Specific focus on “eprint”: a “scientific or scholarly research text” (Budapest Open Access Initiative), e.g. peer-reviewed journal article, preprint, working paper, thesis, book chapter, report, etc.
The problem SWAP tries to solve. Emphasis on open access to research outputs. Eprints & descriptions of eprints typically made available on the Web using “repository systems”, usually supporting OAI-PMH. Inconsistency/ambiguity over “what is being described”: eprint-as-“instance”/copy; eprint-as-content made available in multiple “instances”; eprint as set of related resources. Problems compounded by (over-)emphasis on use of “Simple DC”/oai_dc. Difficult/impossible to reliably access full-text or to determine relationships between “instances”, e.g. revisions, translations etc.
The SWAP approach. Apply the Functional Requirements for Bibliographic Records (FRBR) model to the case of eprints. Make Works, Expressions, Manifestations & Items resources, identified by URIs. Create DCAM-based descriptions of Works, Expressions, Manifestations & Items: relationships between W, E, M, I; attributes of W, E, M, I; relationships between W, E, M, I & Agents. Mostly neutral on use of protocols: designed with OAI-PMH in mind, but does take a resource-centric approach.
Components delivered. Functional requirements specification. “Domain model”, based on a subset of FRBR. The “Scholarly Works Description Set Profile”, plus human-readable commentary and usage guidelines. New vocabularies of metadata terms, with URIs like http://purl.org/eprint/terms/xyz. Eprints DC-XML: XML format, based on work in progress within DCMI at the time.
Functional Requirements for Bibliographic Records (FRBR). Report of IFLA Study Group, 1998. Entity-Relational model for the “world” that bibliographic records describe. FRBR models the world using 4 key entities (Group 1 Entities): a work is a distinct intellectual or artistic creation (an abstract entity); an expression is the intellectual or artistic realization of a work; a manifestation is the physical embodiment of an expression of a work; an item is a single exemplar of a manifestation (a concrete entity). Primary relationships: Work -- is realized through --> Expression; Expression -- is embodied in --> Manifestation; Manifestation -- is exemplified by --> Item.
FRBR Group 1 Entities. [Diagram: Work isRealisedThrough Expression (1..∞); Expression isEmbodiedIn Manifestation (∞..∞); Manifestation isExemplifiedBy Copy (1..∞).]
Functional Requirements for Bibliographic Records (FRBR). Work-Work Relationships: Successor, Supplement, Adaptation etc; Whole-Part. Expression-Expression Relationships: Abridgement, Revision, Translation etc; Whole-Part. Manifestation-Manifestation Relationships: Reproduction, Alternate; Whole-Part. Item-Item Relationships: Reconfiguration, Reproduction; Whole-Part.
Functional Requirements for Bibliographic Records (FRBR). Group 2 Entities: Person, Corporate body. Responsibility relationships: Work is-Created-By Person/CB; Expression is-Realised-By Person/CB; Manifestation is-Produced-By Person/CB; Item is-Owned-By Person/CB. Group 3 Entities: Concept, Object, Event and Place. Subject relationships: Work has-as-Subject Work/Expression/Manifestation/Item; Work has-as-Subject Person/CB; Work has-as-Subject Concept/Object/Event/Place.
The SWAP Domain Model: Entities & Relationships. SWAP adopts/extends a subset of the FRBR model. [Diagram: ScholarlyWork isExpressedAs Expression (0..∞); Expression isManifestedAs Manifestation (0..∞); Manifestation isAvailableAs Copy (0..∞). Relationships to Agent (each 0..∞): isCreatedBy, isPublishedBy, isEditedBy, isFundedBy, isSupervisedBy. AffiliatedInstitution is also shown.]
The SWAP Domain Model: Attributes.
ScholarlyWork: title, subject, abstract, affiliated institution, identifier.
Agent: name, type of agent, date of birth, mailbox, homepage, identifier.
Expression: title, date available, status, version number, language, genre / type, copyright holder, bibliographic citation, identifier.
Manifestation: format, date modified.
Copy: date available, access rights, licence, identifier.
The SWAP Domain Model. [Diagram (example instance): a ScholarlyWork isExpressedAs two Expressions; each Expression isManifestedAs a Manifestation; the Manifestations are isAvailableAs three Copies in total. Further relationship labels shown: hasAdaptation, hasVersion, hasTranslation.]
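The chain from ScholarlyWork down to Copy can be sketched with plain data classes. This is an illustration of the shape of the model only: the class and field names loosely follow the attribute slide, while the example titles, formats and URIs are made up and `copies_of` is a helper invented here.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Copy:
    identifier: str           # URI at which this copy is available
    access_rights: str = ""


@dataclass
class Manifestation:
    format: str
    copies: List[Copy] = field(default_factory=list)         # isAvailableAs


@dataclass
class Expression:
    title: str
    language: str = ""
    manifestations: List[Manifestation] = field(default_factory=list)  # isManifestedAs


@dataclass
class ScholarlyWork:
    title: str
    expressions: List[Expression] = field(default_factory=list)        # isExpressedAs


def copies_of(work: ScholarlyWork) -> Iterator[Copy]:
    """Walk Work -> Expression -> Manifestation -> Copy."""
    for expression in work.expressions:
        for manifestation in expression.manifestations:
            yield from manifestation.copies


work = ScholarlyWork("An example paper", [
    Expression("An example paper (preprint)", "en", [
        Manifestation("application/pdf", [
            Copy("http://example.org/repo-a/paper.pdf"),
            Copy("http://example.org/repo-b/paper.pdf"),
        ]),
    ]),
    Expression("An example paper (published version)", "en", [
        Manifestation("text/html", [
            Copy("http://example.org/journal/paper.html"),
        ]),
    ]),
])
all_copies = list(copies_of(work))
```

Asking for every copy of a work, regardless of version or format, becomes a simple traversal once the four levels are kept distinct, which is the kind of question flat “Simple DC” records make hard to answer.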
Aggregations! http://arxiv.org/abs/astro-ph/0611775 This is only a tentative analysis! [Screenshot annotations: (Description of?) Expression?; Manifestations/Items; Other Expressions of same Work; Expression Identifiers; Relationships.]
The SWAP Description Set Profile. Specifies “description templates” for descriptions of instances of the five entity types: Scholarly Work, Expression, Manifestation, Copy (Item), Agent. Specifies “statement templates” for statements within those descriptions, e.g. a description of a Scholarly Work must contain a statement using the dc:title property with a literal value; e.g. when a description of a Scholarly Work contains a statement using the eprint:isExpressedAs property, a description of the value must follow the constraints of the “Expression” description template. Small set of required data; much is optional.
Thoughts on the Approach Clarity on “what is being described” Easier to rationalise ‘traditional’ and ‘modern’ citations traditional citations tend to refer to ‘expressions’ hypertext links tend to refer to ‘copies’/‘items’ Clarity on relationships between resources facilitates managing “versions” Relatively complex underlying model may be manifest in relatively simple cataloguer and/or end-user interfaces FRBR applicable to other “intellectual creations” Potential to merge with other data FRBR as key part of RDA initiative to update library cataloguing standards
Thoughts on the Approach. However, limited implementation of SWAP to date (in JISC context). Inherent complexity? Fitting in to workflow? Compatibility with existing internal repository software models? See e.g. Jenny Delasalle (Warwick University), “SWAP and e-prints structures don’t match”, http://blogs.warwick.ac.uk/wrap/entry/swap_and_e-prints/: eprints.org model of “eprint” + “document” v SWAP Work, Expression, Manifestation & Item.
Applications in e-Science domain Applicable to scientific eprints! SWAP was scoped to focus on eprints FRBR model more broadly applicable Current JISC-funded work on DCAPs for Geo-Spatial resources Still Images Time-Based Media Learning Materials (scoping study) Preliminary investigations on datasets Relationship of FRBR & aggregation models (ORE?) to description of datasets
Concluding thoughts
Some concluding thoughts (SWAP compared with ORE)
Nature. Both: seek to address problems of ambiguity of identification, explicitness of relationship types in (Semantic) Web-friendly ways.
Scope. SWAP: scoped to specific resource type (scholarly works) (though FRBR broader). ORE: general in scope (though roots in scholarly communication domain).
Domain model. SWAP: subset of FRBR model (several different relationship types); SWAP doesn’t address aggregation. ORE: Aggregations (ore:aggregates).
Description model. SWAP: DCAM, layered on RDF. ORE: RDF (with constraints).
Deployment. SWAP: less explicit, doesn’t include the “describes” relationship, but resource-centric. ORE: explicitly aligns with Web Arch/httpRange-14 etc.
Vocabularies. SWAP: specifies set of properties to be used. ORE: requires core set of properties, allows for others.
Acknowledgements. The sections on Web Architecture and OAI ORE make use of material from a presentation by Herbert Van de Sompel, “An Introduction to the ORE interoperability framework”, presented at the 4th Search/Find Workshop, Ghent, Belgium, 22 August 2008. The section on SWAP adapts material from a presentation by Julie Allinson, “A Dublin Core Application Profile for Scholarly Works”, presented to the JISC CETIS Metadata & Digital Repositories SIG, Manchester, UK, 16 April 2007. Title slide photo “Iron ore flakes” by Flickr user lars hammar; see http://www.flickr.com/photos/7603557@N08/1195127627/. Made available under CC Attribution-NonCommercial-Share-Alike 2.0 license.
Introduction to c language from lecture slides
GDG Cloud Southlake #45: Patrick Debois: The Impact of GenAI on Development a...
Uncertainty-aware contextual multi-armed bandits for recommendations in e-com...
AQUEEL MUSHTAQUE FAKIH COMPUTER CENTER .
EIS-Webinar-Regulated-Industries-2025-08.pdf
Be ready for tomorrow’s needs with a longer-lasting, higher-performing PC
CCUS-as-the-Missing-Link-to-Net-Zero_AksCurious.pdf
State of AI in Business 2025 - MIT NANDA
From Curiosity to ROI — Cost-Benefit Analysis of Agentic Automation [3/6]
FASHION-DRIVEN TEXTILES AS A CRYSTAL OF A NEW STREAM FOR STAKEHOLDER CAPITALI...
Ebook - The Future of AI A Comprehensive Guide.pdf

ORE and SWAP: Composition and Complexity

  • 1. ORE & SWAP: Composition & Complexity Metadata Infrastructures, MPG eScience Seminar, Berlin
  • 2. ORE & SWAP: Composition & Complexity Background/Foundations Web Architecture Resource Description Framework OAI Object Reuse & Exchange (OAI ORE) More Background/Foundations The DCMI Abstract Model Dublin Core Application Profiles Scholarly Works DC Application Profile (SWAP) Conclusions
  • 3. Background: Web Architecture, RDF & (briefly) Linked Data
  • 4. W3C Web Architecture: Identification Identifies Resource = anything of interest Resources are identified by URIs Resource URI
  • 5. W3C Web Architecture: Representation Identifies Representation = data that encodes information about the state of a resource One resource may have multiple representations Representations may change over time Resource URI Representation 2 Represents Representation 1 Represents
  • 6. W3C Web Architecture: Interaction Resources manipulated through exchange of representations Messages exchanged using network protocols Resource is always “hidden” Resource Client Representation 2 Represents Representation 1 Represents Content Negotiation Server request message (may include representation) response message (may include representation)
  • 7. W3C Web Architecture: Interaction Client 1 to Server: GET /doc/d/ HTTP/1.1 Host: example.org Server to Client 1: HTTP/1.1 200 OK Content-Type: text/html <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" Client 2 to Server: GET /doc/d/ HTTP/1.1 Host: example.org Accept: application/xhtml+xml Server to Client 2: HTTP/1.1 200 OK Content-Type: application/xhtml+xml <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
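The exchange on slide 7 can be sketched in a few lines of Python. This is only an illustration, not how any particular server implements content negotiation: the `REPRESENTATIONS` table, the `negotiate` function and the default media type are assumptions made for the example, and q-values in the Accept header are ignored.

```python
# Hypothetical representations of the single resource /doc/d/, keyed by media type.
REPRESENTATIONS = {
    "text/html": '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">',
    "application/xhtml+xml": '<?xml version="1.0" encoding="UTF-8"?>',
}
DEFAULT_TYPE = "text/html"  # assumed default when the client states no preference


def negotiate(accept_header=None):
    """Return (content_type, body) for a GET on the resource.

    Simplification: media types are tried in the order listed in the
    Accept header; q-values are ignored.
    """
    if accept_header:
        for item in accept_header.split(","):
            media_type = item.split(";")[0].strip()
            if media_type in REPRESENTATIONS:
                return media_type, REPRESENTATIONS[media_type]
    return DEFAULT_TYPE, REPRESENTATIONS[DEFAULT_TYPE]


# Client 1 sends no Accept header, Client 2 asks for XHTML.
client_1 = negotiate()
client_2 = negotiate("application/xhtml+xml")
```

Both clients address the same URI; only the representation returned differs.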
  • 8. Resource Description Framework (RDF) Set of specifications from W3C A simple data model... ...for making assertions about relationships between resources “Thing-X is-related-in-some-way-to Thing-Y” Types of relationship are properties Assertions made in the form of triples Subject, Predicate, Object Sets of triples represented as graphs Serialisation in multiple formats, including RDF/XML
  • 9. Resource Description Framework (RDF) Document D Person P Organisation O is created by is published by
  • 10. Resource Description Framework (RDF) RDF designed for use on Web Global context, global naming: URIs URIs as names for things described (subject, object) URIs as names of relationships between things (predicate) URI-based extensibility Also allows for “unnamed” things: “blank nodes” (subject, object) “literals” = text strings (object only) Formal semantics defines rules for comparing & merging sets of triples, logical inferencing
  • 11. Resource Description Framework (RDF) http://ex.org/doc/d http://ex.org/per/p http://ex.org/org/o http://purl.org/dc/terms/creator http://purl.org/dc/terms/publisher URIs as subjects & objects URIs as predicates Triples (Subject, Predicate, Object): (http://ex.org/doc/d, http://purl.org/dc/terms/creator, http://ex.org/per/p); (http://ex.org/doc/d, http://purl.org/dc/terms/publisher, http://ex.org/org/o)
  • 12. Resource Description Framework (RDF) document:d person:p organisation:o dcterms:creator dcterms:publisher URIs often abbreviated as prefixed names in RDF syntaxes Literals as objects “Andy Powell” “DCMI” foaf:name foaf:name
  • 13. Resource Description Framework (RDF) document:d person:p organisation:o dcterms:creator dcterms:publisher “Andy Powell” “DCMI” foaf:mbox foaf:name person:p mailto:andy.powell@example.org http://example.org/doc/andy/ foaf:name foaf:homepage Two graphs with common nodes…
  • 14. Resource Description Framework (RDF) document:d person:p organisation:o dcterms:creator dcterms:publisher “Andy Powell” “DCMI” foaf:name foaf:name mailto:andy.powell@example.org http://example.org/doc/andy/ foaf:mbox foaf:homepage … can be merged
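The merge shown on slides 13 and 14 can be approximated with plain Python sets, treating each triple as a tuple. This is a sketch under stated assumptions: the `Literal` type and variable names are made up for the example, the prefixes are expanded using the namespaces from slide 11, and no RDF library is involved. It only covers graphs without blank nodes, which would need renaming before a merge.

```python
from collections import namedtuple

DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"

DOC = "http://ex.org/doc/d"
PERSON = "http://ex.org/per/p"
ORG = "http://ex.org/org/o"

# Hypothetical wrapper that keeps literals distinct from URIs.
Literal = namedtuple("Literal", ["text"])

# First graph: the document, its creator and its publisher.
graph_1 = {
    (DOC, DCTERMS + "creator", PERSON),
    (DOC, DCTERMS + "publisher", ORG),
    (PERSON, FOAF + "name", Literal("Andy Powell")),
    (ORG, FOAF + "name", Literal("DCMI")),
}

# Second graph: more statements about the same person URI.
graph_2 = {
    (PERSON, FOAF + "name", Literal("Andy Powell")),
    (PERSON, FOAF + "mbox", "mailto:andy.powell@example.org"),
    (PERSON, FOAF + "homepage", "http://example.org/doc/andy/"),
}

# Because both graphs name the person with the same URI, merging is set
# union: the shared triple collapses and the rest accumulate.
merged = graph_1 | graph_2
about_person = {t for t in merged if t[0] == PERSON}
```

The global naming provided by URIs is what makes the union meaningful: the two graphs need no prior agreement beyond using the same name for the same thing.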
  • 15. Web Architecture: httpRange-14 Widespread use of http URIs as names for things other than documents, while also serving representation Problem: does URI identify thing or document? Solution: the W3C TAG httpRange-14 resolution If server responds to GET with 2xx, then resource is information resource (document) If server responds to GET with 303, then resource could be anything Make a distinction between information resources (Web documents), with URI, with representation “non-information resources” (“real world things”), with URI, with no representation, but with description Sauermann, Cyganiak, Cool URIs for the Semantic Web W3C Interest Group Note, 31 March 2008 http://www.w3.org/TR/2008/NOTE-cooluris-20080331/
  • 16. W3C Web Architecture: httpRange-14 Client to Server: GET /per/p/ HTTP/1.1 Host: example.org Server to Client: HTTP/1.1 303 See Other Location: http://example.org/doc/per/p/ Client to Server: GET /doc/per/p/ HTTP/1.1 Host: example.org Server to Client: HTTP/1.1 200 OK Content-Type: application/xhtml+xml <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
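The two-step exchange on slide 16 might be modelled as below. The lookup tables and function names are hypothetical, chosen only to mirror the paths on the slide; the `classify` function restates the httpRange-14 rule from slide 15 rather than any library behaviour.

```python
HOST = "example.org"

# Hypothetical server configuration: the URI path of a "real-world thing"
# mapped to the path of a document that describes it.
DESCRIBED_BY = {"/per/p/": "/doc/per/p/"}
# Paths of information resources (documents) and their media types.
DOCUMENTS = {"/doc/per/p/": "application/xhtml+xml"}


def respond(path):
    """Return (status, headers) for a GET on the given path."""
    if path in DESCRIBED_BY:
        # The thing itself has no representation: redirect to a description.
        return 303, {"Location": f"http://{HOST}{DESCRIBED_BY[path]}"}
    if path in DOCUMENTS:
        return 200, {"Content-Type": DOCUMENTS[path]}
    return 404, {}


def classify(status):
    """What a client may conclude about the resource under httpRange-14."""
    if 200 <= status < 300:
        return "information resource"
    if status == 303:
        return "could be anything"
    return "unknown"


first_status, first_headers = respond("/per/p/")
second_status, second_headers = respond("/doc/per/p/")
```

A client following the redirect ends up with a document about the person, while the original URI remains a name for the person.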
  • 17. Linked Data Berners-Lee’s Linked Data principles Design Issues Note (2006) http://www.w3.org/DesignIssues/LinkedData.html Use URIs as names for things. Use http URIs so that people can look up those names. When someone looks up a URI, provide useful information. Include links to other URIs, so that they can discover more things.
  • 18. How to Publish Linked Data on the Web Bizer, Cyganiak, Heath: How to Publish Linked Data on the Web (2007) http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/LinkedDataTutorial/ Provides guidelines on subset of RDF features (for merging & querying) choosing URIs, linking and using URI aliases “useful information to return” description, backlinks, related descriptions, metadata recipes for serving data
  • 19. Open Archives Initiative Object Reuse & Exchange (OAI ORE) With thanks to Herbert Van de Sompel (Los Alamos National Laboratory, USA) for several of the slides used in this section
  • 20. Open Archives Initiative Object Reuse & Exchange (ORE) Project of Open Archives Initiative http://www.openarchives.org/ore/ Funded by The Andrew W. Mellon Foundation Additional support from The Coalition for Networked Information Joint Information Systems Committee Microsoft Corporation The National Science Foundation Co-ordinated by Carl Lagoze & Herbert Van de Sompel
  • 21. The problem ORE tries to solve Often useful to conceptualise (scholarly and other) resources as grouped into aggregations Typically, way in which resources made available on Web does not make aggregate nature of resource explicit presents component/member resources as linked in some way, but without making clear how related introduces ambiguity between identification of set and identification of component/member
  • 24. An Aggregation and the Web Resources of an Aggregation are distinct URI-identified Web resources Missing are: The boundary that delineates the Aggregation in the Web An identity (URI) for the Aggregation
  • 25. The ORE approach Make Aggregations resources, identified by URIs Create RDF-based descriptions of Aggregations (& their component resources) relationships between Aggregation & component resources attributes of Aggregation, component resources relationships between Aggregation, component resources and other resources Make Resource Maps resources, identified by URIs include attributes of Resource Map relationships between Resource Map and other resources Make Resource Maps available using conventions of Web Arch/Cool URIs
  • 26. Publish a Resource Map to the Web
  • 27. The Resource Map Identifies and Describes the Aggregation
  • 28. The Resource Map and Aggregation as (distinct) Web resources
  • 29. Constructing an ORE Resource Map
  • 30. Some resources that belong together
  • 31. Introduce the Aggregation This resource is an Aggregation
  • 32. Express the ore:aggregates relationship The inverse is ore:isAggregatedBy
  • 34. Express the ore:describes relationship
  • 35. The ore:isDescribedBy relationship The inverse of ore:describes is ore:isDescribedBy; subproperty of rdfs:seeAlso
  • 36. Express metadata about the Resource Map This corresponds to metadata from the Linked Data recommendations
  • 37. A Resource Map can “say” a lot … This corresponds to the description , related descriptions , backlinks , metadata from the Linked Data recommendations
  • 38. But minimally it “says” this … This corresponds to the description (minimal), and metadata from the Linked Data recommendations
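As a rough illustration of what a minimal Resource Map “says” (slides 29 to 38), the sketch below builds the triples as Python tuples. It assumes the minimal metadata about the Resource Map is a creator and a modification time; the function name and all example.org URIs are hypothetical, and the explicit type statement for the Aggregation is included only to mirror slide 31.

```python
ORE = "http://www.openarchives.org/ore/terms/"
DCTERMS = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def minimal_resource_map(rem, aggregation, aggregated, creator, modified):
    """Build the triples of a minimal Resource Map as a set of tuples."""
    triples = {
        # The Resource Map describes exactly one Aggregation.
        (rem, ORE + "describes", aggregation),
        # Slide 31: "This resource is an Aggregation".
        (aggregation, RDF_TYPE, ORE + "Aggregation"),
        # Metadata about the Resource Map itself (assumed minimal set).
        (rem, DCTERMS + "creator", creator),
        (rem, DCTERMS + "modified", modified),
    }
    # One ore:aggregates statement per resource that belongs together.
    for resource in aggregated:
        triples.add((aggregation, ORE + "aggregates", resource))
    return triples


# Hypothetical URIs: the Aggregation and its Resource Map are distinct resources.
resource_map = minimal_resource_map(
    rem="http://example.org/rem/1",
    aggregation="http://example.org/aggregation/1",
    aggregated=["http://example.org/paper.pdf", "http://example.org/data.csv"],
    creator="http://example.org/per/p",
    modified="2008-11-01T00:00:00Z",
)
```

Keeping the Resource Map and the Aggregation under separate URIs is what lets statements about the description (who made it, when) stay apart from statements about the thing described.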
  • 39. Proxy: a Resource in the Context of an Aggregation ore:proxyFor and ore:proxyIn to introduce a Proxy for an Aggregated Resource
  • 40. ore:lineage An Aggregated Resource originated in another Aggregation ore:lineage is a relationship between Proxies
  • 41. The ORE model in summary
  • 42. Representing an ORE Resource Map An ORE Resource Map is an RDF Graph, so can be represented using any RDF syntax ORE provides guidelines for using RDF/XML RDFa Also guidelines for representing ORE Resource Map as Atom Entry Document With GRDDL transformation available
  • 43. Publishing an ORE Resource Map ORE Aggregation as “non-information resource”/“real-world thing” ORE Resource Map as “information resource”/“Web document” ORE recommends selected HTTP “recipes” from Cool URIs for the Semantic Web “Hash URI” Simple 303 redirect to URI of Resource Map Content negotiation + 303 redirect to URI of Resource Map
  • 44. Content negotiation with 303 redirect
  • 45. Summary ORE addresses problem of describing Aggregations of resources Tries to do so in “resource-centric” way (cf. OAI-PMH) Tries to cater for both Semantic Web contexts and Atom contexts Agnostic about nature of aggregated resources, other relationships between aggregations, aggregated resources “The least we can specify whilst being useful” (Jim Downing) Generic; potentially, broadly applicable Will typically be used alongside other vocabularies, models?
  • 46. Applications in e-Science domain Applicable to any Aggregation! e.g. Transfer/migration/archiving of “packages” of resources? Collected outputs of activity as Aggregations? Thesis + datasets as Aggregations (THEOREM project, Cambridge) Papers + chemistry data (ORECHEM) And more….
  • 47. More Background: The DCMI Abstract Model and Dublin Core Application Profiles
  • 48. Why DCAM? Dublin Core in c. 2003 Metadata vocabularies … but what is a DC “element”? … what sort of “terms” are used in DC metadata? Syntax independence & encoding guidelines … but what is being “encoded”? “Dublin Core”? “Simple” and “Qualified” DC … vocabularies? … formats? (e.g. oai_dc) … constraints on use of vocabularies? On which vocabularies? Grammatical Principles (including “1-to-1 rule”) … fairly informal Relationship between DC & Resource Description Framework history of co-evolution
  • 49. DCMI Abstract Model Work by DCMI Architecture WG from mid-2003, initiated by Andy Powell (UKOLN, Eduserv) Second Version, DCMI Recommendation, 2007-06-04 http://dublincore.org/documents/2007/06/04/abstract-model/ Based on RDF Semantics Binary relations between resources Use of URIs RDF Schema inferencing DCAM uses the terminology Described resource Property = type of relationship Value = other resource Introduces different “description model” “Description set” as a community-specific “view” of an RDF Graph
  • 50. Description Description Statement Property URI Statement Property URI Statement Property URI Statement Property URI Description Set Resource URI Resource URI Non-Literal Value Surrogate Non-Literal Value Surrogate Literal Value Surrogate Literal Value Surrogate
  • 51. Resource URI Resource URI Value URI Description Statement Property URI Description Set Literal Value Surrogate Description Statement Property URI Non-Literal Value Surrogate Statement Property URI Non-Literal Value Surrogate Vocab Enc Scheme URI Value URI Value string Value string Syntax Enc Scheme URI Language Value string Language Statement Property URI Literal Value Surrogate Value string Syntax Enc Scheme URI
  • 52. Description Statement Property URI Description Set Resource URI Literal Value Surrogate Description Statement Property URI Resource URI Non-Literal Value Surrogate Statement Property URI Non-Literal Value Surrogate Value URI Vocab Enc Scheme URI Value URI Value string Value string Syntax Enc Scheme URI Language Value string Language Statement Property URI Literal Value Surrogate Value string Syntax Enc Scheme URI
  • 54. Resource URI Property URI Value URI Resource URI Value URI Property URI Property URI VES URI “literal”@lang “literal”^^SES URI “literal”@lang “literal”^^SES URI Property URI dcam:memberOf rdf:value
  • 56. Example: Description of document, description of publisher Description Set Description Statement Statement <http://purl.org/dc/terms/subject> Non-Literal Value Surrogate Non-Literal Value Surrogate <http://example.org/terms/mySH> “Metadata” “Métadonnées” en fr <http://purl.org/dc/terms/publisher> <http://dublincore.org/documents/2007/06/04/abstract-model/> <http://example.org/org/DCMI> Property URI Value URI <http://example.org/mySH/h123> Value URI Property URI Vocab Enc Scheme URI Value String Value String Description Statement <http://example.org/org/DCMI> <http://xmlns.com/foaf/0.1/name> Literal Value Surrogate “Dublin Core Metadata Initiative” en Value String Property URI Statement <http://purl.org/dc/terms/created> Literal Value Surrogate “1995” ^^xsd:gYear Value String Property URI
  • 57. http://dublincore.org/.../abstract-model/ dcterms:publisher http://example.org/mySH/h123 http://example.org/org/DCMI http://example.org/org/DCMI dcterms:subject foaf:name http://example.org/terms/mySH “Metadata”@en “Métadonnées”@fr “Dublin Core…”@en “1995”^^xsd:gYear dcterms:created dcam:memberOf rdf:value
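The mapping from a DCAM description set to an RDF graph, as drawn on slides 54 to 57, can be sketched roughly as follows. The function names and the `Literal` type are assumptions for illustration; the sketch covers only non-literal value surrogates that carry a value URI (not the blank-node case) and uses the example values from slide 56.

```python
from collections import namedtuple

DCTERMS = "http://purl.org/dc/terms/"
DCAM_MEMBER_OF = "http://purl.org/dc/dcam/memberOf"
RDF_VALUE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#value"
XSD_GYEAR = "http://www.w3.org/2001/XMLSchema#gYear"

# Hypothetical literal type: text plus optional language tag or
# syntax encoding scheme (datatype) URI.
Literal = namedtuple("Literal", ["text", "lang", "datatype"], defaults=(None, None))


def literal_statement(resource, prop, literal):
    """Literal value surrogate: the value string is the object itself."""
    return {(resource, prop, literal)}


def nonliteral_statement(resource, prop, value_uri, ves_uri=None, value_strings=()):
    """Non-literal value surrogate: the value URI is the object; the
    vocabulary encoding scheme and value strings hang off the value."""
    triples = {(resource, prop, value_uri)}
    if ves_uri is not None:
        triples.add((value_uri, DCAM_MEMBER_OF, ves_uri))
    for string in value_strings:
        triples.add((value_uri, RDF_VALUE, string))
    return triples


DOC = "http://dublincore.org/documents/2007/06/04/abstract-model/"

subject_triples = nonliteral_statement(
    DOC,
    DCTERMS + "subject",
    value_uri="http://example.org/mySH/h123",
    ves_uri="http://example.org/terms/mySH",
    value_strings=[Literal("Metadata", lang="en"), Literal("Métadonnées", lang="fr")],
)
created_triples = literal_statement(
    DOC, DCTERMS + "created", Literal("1995", datatype=XSD_GYEAR)
)
graph = subject_triples | created_triples
```

One statement with a non-literal value surrogate thus expands into several triples, which is why the description set reads as a more compact “view” of the underlying graph.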
  • 59. Dublin Core Application Profile Specification of how to construct & deploy description sets (descriptions, statements)… … to serve some purpose … meet some set of requirements At core, a profile of a “description set” a set of constraints on the description set based on E-R model of problem space Enables structural validation predictability for processing, querying etc -> tools
  • 60. DCMI Description Set Profile (DSP) A way of describing structural constraints on a description set the resources that may be described by descriptions in the description set the properties that may be referenced in statements the ways a value surrogate may be given Description templates, statement templates Model & XML Syntax for DSP Working draft by Mikael Nilsson (Royal Institute of Technology, Sweden) http://dublincore.org/documents/2008/03/31/dc-dsp/
  • 61. Description & Statement Templates <DescriptionTemplate minOccurs="1" maxOccurs="1" standalone="yes"> <StatementTemplate minOccurs="1" maxOccurs="1" type="literal"> <Property>http://purl.org/dc/terms/title</Property> </StatementTemplate> … </DescriptionTemplate>
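To make the idea of structural validation concrete, here is a small sketch of checking one description against statement templates like the one on slide 61. It is not the DSP processing model: the `StatementTemplate` class, the `violations` function and the representation of a description as (property, value type) pairs are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class StatementTemplate:
    property_uri: str
    min_occurs: int
    max_occurs: Optional[int]  # None stands for "no upper bound"
    value_type: str            # "literal" or "nonliteral"


def violations(description: List[Tuple[str, str]],
               templates: List[StatementTemplate]) -> List[str]:
    """Check a description, given as (property_uri, value_type) pairs,
    against statement templates and return human-readable problems."""
    problems = []
    for template in templates:
        matching = [s for s in description if s[0] == template.property_uri]
        if len(matching) < template.min_occurs:
            problems.append(f"too few statements for {template.property_uri}")
        if template.max_occurs is not None and len(matching) > template.max_occurs:
            problems.append(f"too many statements for {template.property_uri}")
        for _, value_type in matching:
            if value_type != template.value_type:
                problems.append(f"wrong value type for {template.property_uri}")
    return problems


# The template from slide 61: exactly one title, given as a literal.
TITLE = StatementTemplate("http://purl.org/dc/terms/title", 1, 1, "literal")

good = violations([("http://purl.org/dc/terms/title", "literal")], [TITLE])
missing = violations([], [TITLE])
```

Checks of this kind are what give tools the predictability mentioned on slide 59.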
  • 62. Dublin Core Application Profile A DC Application Profile is a “packet of documentation” which consists of: Functional requirements (desirable) Domain model (mandatory) Description Set Profile (DSP) (mandatory) Usage guidelines (optional) Encoding syntax guidelines (optional)
  • 63. Foundation standards Domain standards Application Profile The “Singapore Framework”
  • 64. Scholarly Works (ePrints) DC Application Profile (SWAP)
  • 65. Scholarly Works DC Application Profile SWAP/eprints Application Profile http://www.ukoln.ac.uk/repositories/digirep/index/Eprints_Application_Profile Development funded by JISC, Summer 2006 Co-ordinated by Julie Allinson (UKOLN) & Andy Powell (Eduserv Foundation) Specific focus on “eprint”: a “scientific or scholarly research text” (Budapest Open Access Initiative), e.g. peer-reviewed journal article, preprint, working paper, thesis, book chapter, report, etc.
  • 66. The problem SWAP tries to solve Emphasis on open access to research outputs eprints & descriptions of eprints typically made available on Web using “repository systems” usually supporting OAI-PMH Inconsistency/ambiguity over “what is being described” eprint-as-“instance” /copy eprint-as-content made available in multiple “instances” eprint as set of related resources Problems compounded by (over-)emphasis on use of “Simple DC”/oai_dc Difficult/impossible to reliably access full-text determine relationships between “instances” e.g. revisions, translations etc
  • 67. The SWAP approach Apply the Functional Requirements for Bibliographic Records (FRBR) model to the case of eprints Make Works, Expressions, Manifestations & Items resources, identified by URIs Create DCAM-based descriptions of Works, Expressions, Manifestations & Items relationships between W, E, M, I attributes of W, E, M, I relationships between W, E, M, I & Agents Mostly neutral on use of protocols designed with OAI-PMH in mind but does take a resource-centric approach
  • 68. Components delivered Functional requirements specification “Domain model” Based on subset of FRBR The “Scholarly Works Description Set Profile” plus human-readable commentary, usage guidelines New vocabularies of metadata terms With URIs like http://purl.org/eprint/terms/xyz Eprints DC-XML XML format Based on work in progress within DCMI at time
  • 69. Functional Requirements for Bibliographic Records (FRBR) Report of IFLA Study Group, 1998 Entity-Relational model for the “world” that bibliographic records describe FRBR models the world using 4 key entities (Group 1 Entities): a work is a distinct intellectual or artistic creation. A work is an abstract entity an expression is the intellectual or artistic realization of a work a manifestation is the physical embodiment of an expression of a work an item is a single exemplar of a manifestation. The entity defined as item is a concrete entity Primary relationships Work -- is realized through --> Expression Expression -- is embodied in --> Manifestation Manifestation -- is exemplified by --> Item
  • 70. FRBR Group 1 Entities Work Expression 1..∞ isRealisedThrough Manifestation isEmbodiedIn ∞ ..∞ Copy isExemplifiedBy 1..∞
  • 71. Functional Requirements for Bibliographic Records (FRBR) Work-Work Relationships Successor, Supplement, Adaptation etc Whole-Part Expression-Expression Relationships Abridgement, Revision, Translation etc Whole-Part Manifestation-Manifestation Relationships Reproduction, Alternate Whole-Part Item-Item Relationships Reconfiguration, Reproduction Whole-Part
  • 72. Functional Requirements for Bibliographic Records (FRBR) Group 2 Entities: Person, Corporate body Responsibility relationships Work is-Created-By Person/CB Expression is-Realised-By Person/CB Manifestation is-Produced-By Person/CB Item is-Owned-By Person/CB Group 3 Entities: Concept, Object, Event and Place Subject relationships Work has-as-Subject Work/Expression/Manifestation/Item Work has-as-Subject Person/CB Work has-as-Subject Concept/Object/Event/Place
  • 73. The SWAP Domain Model: Entities & Relationships ScholarlyWork SWAP adopts/extends a subset of the FRBR model Expression 0..∞ isExpressedAs Manifestation isManifestedAs 0..∞ Copy isAvailableAs 0..∞ 0..∞ 0..∞ isCreatedBy isPublishedBy 0..∞ isEditedBy 0..∞ isFundedBy isSupervisedBy AffiliatedInstitution Agent
  • 74. The SWAP Domain Model: Attributes ScholarlyWork: title subject abstract affiliated institution identifier Agent: name type of agent date of birth mailbox homepage identifier Expression: title date available status version number language genre / type copyright holder bibliographic citation identifier Manifestation: format date modified Copy: date available access rights licence identifier
  • 75. The SWAP Domain Model Expression isExpressedAs Expression isExpressedAs Manifestation Manifestation isManifestedAs isManifestedAs hasAdaptation ScholarlyWork hasVersion hasTranslation Copy isAvailableAs Copy isAvailableAs Copy isAvailableAs
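The chain of entities in the SWAP domain model (slides 73 to 75) could be modelled along the following lines. This is an illustrative data structure only: the class and field names paraphrase a few of the attributes from slide 74, the helper function is made up, and the example URIs are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Copy:
    identifier: str
    access_rights: str = ""


@dataclass
class Manifestation:
    format: str
    copies: List[Copy] = field(default_factory=list)  # isAvailableAs


@dataclass
class Expression:
    title: str
    language: str = ""
    version: str = ""
    manifestations: List[Manifestation] = field(default_factory=list)  # isManifestedAs


@dataclass
class ScholarlyWork:
    title: str
    expressions: List[Expression] = field(default_factory=list)  # isExpressedAs


def copies_of(work: ScholarlyWork) -> List[str]:
    """Follow isExpressedAs, isManifestedAs and isAvailableAs to list
    the identifiers of every available copy of a work."""
    return [
        copy.identifier
        for expression in work.expressions
        for manifestation in expression.manifestations
        for copy in manifestation.copies
    ]


# One work with an original and a translated expression, each available as PDF.
work = ScholarlyWork(
    title="An example paper",
    expressions=[
        Expression("An example paper", language="en", manifestations=[
            Manifestation("application/pdf", [Copy("http://example.org/paper-en.pdf")]),
        ]),
        Expression("Ein Beispielartikel", language="de", manifestations=[
            Manifestation("application/pdf", [Copy("http://example.org/paper-de.pdf")]),
        ]),
    ],
)
available = copies_of(work)
```

Separating the levels is what allows a citation to point at an expression while a hyperlink points at a particular copy.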
  • 76. Aggregations! http://arxiv.org/abs/astro-ph/0611775 This is only a tentative analysis! Manifestations/Items Other Expressions of same Work Expression Identifiers Relationships (Description of?) Expression?
  • 77. The SWAP Description Set Profile Specifies “description templates” for descriptions of instances of the five entity types Scholarly Work, Expression, Manifestation, Copy (Item), Agent Specifies “statement templates” for statements within those descriptions e.g. a description of a Scholarly Work must contain a statement using the dc:title property with a literal value e.g. when a description of a Scholarly Work contains a statement using the eprint:isExpressedAs property, a description of the value must follow the constraints of the “Expression” description template Small set of required data, much is optional
  • 78. Thoughts on the Approach Clarity on “what is being described” Easier to rationalise ‘traditional’ and ‘modern’ citations traditional citations tend to refer to ‘expressions’ hypertext links tend to refer to ‘copies’/‘items’ Clarity on relationships between resources facilitates managing “versions” Relatively complex underlying model may be manifest in relatively simple cataloguer and/or end-user interfaces FRBR applicable to other “intellectual creations” Potential to merge with other data FRBR as key part of RDA initiative to update library cataloguing standards
  • 79. Thoughts on the Approach However, limited implementation of SWAP to date (in JISC context) Inherent complexity? Fitting in to workflow? Compatibility with existing internal repository software models? See e.g. Jenny Delasalle (Warwick University), “SWAP and e-prints structures don’t match” http://blogs.warwick.ac.uk/wrap/entry/swap_and_e-prints/ eprints.org model of “eprint” + “document” v SWAP Work, Expression, Manifestation & Item
  • 80. Applications in e-Science domain Applicable to scientific eprints! SWAP was scoped to focus on eprints FRBR model more broadly applicable Current JISC-funded work on DCAPs for Geo-Spatial resources Still Images Time-Based Media Learning Materials (scoping study) Preliminary investigations on datasets Relationship of FRBR & aggregation models (ORE?) to description of datasets
  • 82. Some concluding thoughts: ORE and SWAP compared
      Nature. Both seek to address problems of ambiguity of identification, explicitness of relationship types in (Semantic) Web-friendly ways
      Scope. ORE: general in scope (though roots in scholarly communication domain). SWAP: scoped to specific resource type (scholarly works) (though FRBR broader)
      Domain model. ORE: Aggregations (ore:aggregates). SWAP: subset of FRBR model (several different relationship types); SWAP doesn’t address aggregation
      Description model. ORE: RDF (with constraints). SWAP: DCAM, layered on RDF
      Deployment. ORE: explicitly aligns with Web Arch/httpRange-14 etc. SWAP: less explicit, doesn’t include the “describes” relationship, but resource-centric
      Vocabularies. ORE: requires core set of properties, allows for others. SWAP: specifies set of properties to be used
  • 83. Acknowledgements The sections on Web Architecture and OAI ORE make use of material from a presentation by Herbert Van de Sompel, “An Introduction to the ORE interoperability framework”, presented at the 4th Search/Find Workshop, Ghent, Belgium, 22 August 2008 The section on SWAP adapts material from a presentation by Julie Allinson, “A Dublin Core Application Profile for Scholarly Works”, presented to the JISC CETIS Metadata & Digital Repositories SIG, Manchester, UK, 16 April 2007 Title slide photo “Iron ore flakes” by Flickr user lars hammar See http://www.flickr.com/photos/7603557@N08/1195127627/ Made available under CC Attribution-NonCommercial-Share-Alike 2.0 license