A Semantic Knowledge Graph at
National Library Board Singapore
Richard Wallis
Evangelist and Founder
Data Liberate
richard.wallis@dataliberate.com
2024 LD4 Conference
7th October 2024 - Online
Independent Consultant, Evangelist & Founder
W3C Community Groups:
• Bibframe2Schema (Chair) – Standardised conversion path(s)
• Schema Bib Extend (Chair) - Bibliographic data
• Schema Architypes (Chair) - Archives
• Financial Industry Business Ontology – Financial schema.org
• Tourism Structured Web Data (Co-Chair)
• Schema Course Extension
• Schema IoT Community
• Educational & Occupational Credentials in Schema.org
richard.wallis@dataliberate.com — @dataliberate
40+ Years – Computing
30+ Years – Cultural Heritage technology
20+ Years – Semantic Web & Linked Data
Worked With:
• Google – Schema.org vocabulary, site, extensions, documentation and community
• OCLC – Global library cooperative
• FIBO – Financial Industry Business Ontology Group
• Various Clients – Implementing/understanding Linked Data, Schema.org:
National Library Board Singapore
British Library — Stanford University — Europeana
Agenda for today
• National Library and their resources
• Knowledge Graph ambition
• Linked Data Management System – the LDMS delivered
• Continued development
– Data sharing with the Entity Data Service
– User experience enrichment – the sidebar API
– Data quality enhancement utilizing external authorities
National Library Board Singapore
Public Libraries
• Network of 28 Public Libraries, including 2 partner libraries*
• Reading Programmes and Initiatives
• Programmes and Exhibitions targeted at Singapore communities
*Partner libraries are libraries which are partner owned and funded but managed by NLB/NLB’s subsidiary Libraries and Archives Solutions Pte Ltd. Library@Chinatown and the Lifelong Learning Institute Library are Partner libraries.
National Archives
• Transferred from NHB to NLB in Nov 2012
• Custodian of Singapore’s Collective Memory: Responsible for Collection, Preservation and Management of Singapore’s Public and Private Archival Records
• Promotes Public Interest in our Nation’s History and Heritage
National Library
• Preserving Singapore’s Print and Literary Heritage, and Intellectual memory
• Reference Collections
• Legal Deposit (including electronic)
National Library Board Singapore
Reference Collection
• Over 560,000 Singapore & SEA items
• Over 147,000 Chinese, Malay & Tamil Languages items
• Over 62,000 Social Sciences & Humanities items
• Over 39,000 Science & Technology items
• Over 53,000 Arts items
• Over 19,000 Rare Materials items
Archival Materials
• Over 290,000 Government files & Parliament papers
• Over 190,000 Audiovisual & sound recordings
• Over 70,000 Maps & building plans
• Over 1.14m Photographs
• Over 35,000 Oral history interviews
• Over 55,000 Speeches & press releases
• Over 7,000 Posters
Lending Collection
• Over 5m print collection
• Over 2.4m music tracks
• 78 databases
• Over 7,400 e-newspapers and e-magazines titles
• Over 8,000 e-learning courses
• Over 1.7m e-books and audio books
National Library Board Online Services
The Ambition
• To enable the discovery & display of entities from different sources
in a combined interface
• To bring together resources physical and digital
• To bring together diverse systems across the National Library,
National Archives, and Public Libraries in a Linked Data Environment
• To provide a staff interface to view and manage all entities, their
descriptions and relationships
The Ambition – Technical Challenges
• To produce a Knowledge Graph that is [daily] up to date
• Not to replace current cataloguing processes & practices
– MARC cataloguing in the ILS
– TTE maintenance in authority control
– Dublin Core content management for CMS sites and Archives
• Data sharable with the world
– Linked Open Data
– Schema.org
• An aggregated source of truth
Contract Awarded
metaphactory platform
Low-code knowledge graph platform
Semantic knowledge modeling
Semantic search & discovery
AWS Partner
Public sector partner
Singapore based
Linked Data, Structured Data, Semantic Web, bibliographic metadata, Schema.org and management systems consultant
Basic Data Model
• Linked Data
– BIBFRAME to capture detail of bibliographic records
– Schema.org to deliver structured data for search engines
– Schema.org representation of CMS, NAS, TTE data
– Schema.org enrichment of BIBFRAME
• Schema.org as the ‘lingua franca’ vocabulary of the Knowledge Graph
– All entities described using Schema.org as a minimum.
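As a rough illustration of "all entities described using Schema.org as a minimum", the sketch below builds a minimal Schema.org description as JSON-LD. The function name and the example.org URI are hypothetical; the slides do not show the actual NLB entity URIs or the properties each entity carries.

```python
import json


def describe_organization(uri: str, name: str) -> dict:
    """Build a minimal Schema.org JSON-LD description for an organization."""
    return {
        "@context": "https://schema.org",
        "@id": uri,
        "@type": "Organization",
        "name": name,
    }


# Hypothetical URI, used only for illustration.
doc = describe_organization("https://example.org/entity/sam", "Singapore Art Museum")
print(json.dumps(doc, indent=2))
```

A richer description would add further Schema.org properties, but a type and a name are enough for an entity to be searchable and displayable.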
Data Data Data!
Data Source   Source Records   Entity Count   Update Frequency
ILS           1.4m             7.9m           Daily
CMS           82k              228k           Weekly
NAS           1.6m             6.7m           Monthly
TTE           3k               317k           Monthly
Total         3.1m             15.15m
Data Ingest Pipelines
• Triggered by data upload from source system
• ILS – Daily
– MARC-XML parsed through Open Source scripts:
• marc2bibframe2 – Library of Congress
• Bibframe2schema – Bibframe2Schema.org
• TTE Authorities – Monthly
– Bespoke CSV conversion
• CMS & NAS – Weekly / Monthly
– Dublin Core to Schema.org
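The Dublin Core to Schema.org step for CMS and NAS data could look roughly like the sketch below. The property mapping is an assumption made for illustration: the slides do not list which Dublin Core elements NLB maps to which Schema.org properties, and the function and variable names are hypothetical.

```python
# Illustrative mapping only: not the mapping used in the NLB pipelines.
DC_TO_SCHEMA = {
    "title": "name",
    "creator": "creator",
    "description": "description",
    "date": "dateCreated",
    "subject": "about",
    "publisher": "publisher",
}


def dc_to_schema(record: dict, entity_type: str = "CreativeWork") -> dict:
    """Convert a flat Dublin Core record into a Schema.org-style dict.

    Unmapped Dublin Core elements are skipped rather than guessed at.
    """
    out = {"@context": "https://schema.org", "@type": entity_type}
    for element, value in record.items():
        prop = DC_TO_SCHEMA.get(element)
        if prop is not None and value:
            out[prop] = value
    return out


sample = {"title": "Oral history interview", "date": "1985", "rights": "unknown"}
converted = dc_to_schema(sample)
print(converted)
```

Keeping the mapping in a single table makes it easy to review with cataloguers and to extend per source system.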
Technical Architecture (simplified)
Hosted on Amazon Web Services
[Diagram components: source data import; batch scripts (import control, etc.); pipeline processing; GraphDB Cluster (four boxes); EDS; DMI]
A need for entity reconciliation …
• Lots (and lots and lots) of source entities – 10 million entities
• Lots of duplication
– Lee, Kuan Yew – 1st Prime Minister of Singapore
• 160 individual entities in ILS source data
– Singapore Art Museum
• Entities from source data
• 21 CMS, 1 NAS, 66 ILS, 1 TTE
• Users only want 1 of each!
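To make the duplication problem concrete, here is a minimal sketch that groups source entities describing the same thing by type and normalised name. This is only one naive way to find reconciliation candidates; the slides do not describe the matching rules the LDMS actually uses, and the entity IDs below are invented.

```python
import re
import unicodedata
from collections import defaultdict


def name_key(name: str) -> str:
    """Normalise a name for comparison: strip accents, case and punctuation."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def group_candidates(source_entities):
    """Group (source, entity_id, type, name) tuples by type and normalised name."""
    groups = defaultdict(list)
    for source, entity_id, entity_type, name in source_entities:
        groups[(entity_type, name_key(name))].append((source, entity_id))
    return dict(groups)


# Invented identifiers; the names are taken from the examples above.
entities = [
    ("ILS", "ils-1", "Organization", "Singapore Art Museum"),
    ("CMS", "cms-7", "Organization", "Singapore Art Museum."),
    ("TTE", "tte-3", "Organization", "SINGAPORE ART MUSEUM"),
    ("ILS", "ils-9", "Person", "Lee, Kuan Yew"),
]
groups = group_candidates(entities)
for key, members in groups.items():
    print(key, members)
```

Each resulting group is a candidate for a single entity presented to users, with the source entities kept intact underneath.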
Adaptive Data Model Concepts
• Source entities
– Individual representation of source data
• Aggregation entities
– Tracking relationships between source entities for the same thing
– No copying of attributes
• Primary Entities
– Searchable by users
– Displayable to users
– Consolidation of aggregated source data & managed attributes
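The three concepts can be sketched with plain Python data structures: source entities keep their own attributes, an aggregation only records which source entities belong together, and the primary view is computed by consolidation. All class and function names are hypothetical, and both the source precedence order and the rule that managed attributes override source values are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class SourceEntity:
    entity_id: str
    source: str  # e.g. "ILS", "CMS", "NAS", "TTE"
    attributes: dict


@dataclass
class Aggregation:
    """Tracks which source entities describe the same thing; holds no attributes."""
    aggregation_id: str
    member_ids: list = field(default_factory=list)


def consolidate(aggregation, sources_by_id, managed=None,
                precedence=("TTE", "ILS", "CMS", "NAS")):
    """Build the primary entity view from aggregated sources plus managed attributes."""
    rank = {source: i for i, source in enumerate(precedence)}
    members = [sources_by_id[i] for i in aggregation.member_ids]
    members.sort(key=lambda m: rank.get(m.source, len(rank)))
    primary = {}
    for member in members:
        for key, value in member.attributes.items():
            primary.setdefault(key, value)  # highest-precedence value wins
    primary.update(managed or {})  # assumed: managed attributes override sources
    return primary


sources = {
    "ils-66": SourceEntity("ils-66", "ILS", {"name": "Singapore Art Museum"}),
    "cms-21": SourceEntity("cms-21", "CMS", {"name": "Singapore Art Museum (SAM)",
                                             "url": "https://example.org/sam"}),
}
agg = Aggregation("agg-1", ["cms-21", "ils-66"])
primary = consolidate(agg, sources, managed={"description": "Curated note"})
print(primary)
```

Because the aggregation stores only member IDs, a source record can be updated or re-ingested without touching the aggregation, and the primary view is simply recomputed.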
LDMS Aggregation Model
[Diagram: source entities for Singapore Art Museum (type Organization) drawn from the Library System, Content and Authority sources; ore:Aggregation entities (identifier fragment b8be-8df3f3ac3203); relationships labelled ore:aggregates and ore:isAggregatedBy]
The entity iceberg
[Diagram: iceberg with entity layers Primary, Aggregation and Source; further labels: Discovery, Management, Ingestion Pipelines]
The NLB Knowledge Graph
• 666M Triples
• 10M Source Entities
• 5.8M Primary Entities
– Aggregation of source derived entities
– Searchable
– Shared with world
NLB Linked Data Management System (LDMS)
• Powered by the Knowledge Graph
• Updated daily
• A new separate environment built on established systems
• No changes in cataloguing practices
• No cataloguer retraining
• Not just the bibliographic (MARC) data
• No replacement systems – to implement Linked Data
– MARC-based ILS swap-out occurred mid-project – without LDMS impact
• Delivering Linked Data benefits back into the organization
Building on the Knowledge Graph
Entity Data Service
• Open Linked Data interface
• Dereferencing entity URIs
• Content negotiation for RDF/XML / JSON-LD / Turtle / N-Triples
• Download formats RDF/XML / JSON-LD / Turtle / N-Triples
• Embedded Schema.org
• Enhanced navigation
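Content negotiation for the four serialisations can be sketched as a small function that reads an HTTP Accept header and picks a format. The media types are the standard ones for these formats; the function name, the Turtle default, and the decision to ignore wildcards are assumptions, not a description of how the Entity Data Service is implemented.

```python
# Standard media types for the four serialisations named above.
MEDIA_TYPES = {
    "application/rdf+xml": "RDF/XML",
    "application/ld+json": "JSON-LD",
    "text/turtle": "Turtle",
    "application/n-triples": "N-Triples",
}


def negotiate(accept_header: str, default: str = "text/turtle") -> str:
    """Pick the supported media type with the highest q-value in an Accept header.

    Wildcards such as */* are ignored in this sketch; the default is returned
    when nothing supported is requested.
    """
    best_type, best_q = None, 0.0
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        q = 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        if media_type in MEDIA_TYPES and q > best_q:
            best_type, best_q = media_type, q
    return best_type or default


chosen = negotiate("text/html;q=0.9, application/ld+json")
print(chosen, "->", MEDIA_TYPES[chosen])
```

A browser asking for HTML would instead be served the human-readable entity page, which is where the embedded Schema.org markup comes in.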
Building on the Knowledge Graph
Enriching the User Journey
• Systems are often silos
• User search and navigation constrained by their own data
• Knowledge Graph populated from several individual systems
• Entities aggregated and related across system sources
• The fuel to explore between systems
• Via a navigational interface sidebar
• Plugged into user interface
• Powered by a JavaScript Sidebar API
Use of the JavaScript Sidebar API
→ API call to KG – article ID passed as parameter
← Description of associated Primary entity returned
Description includes list of ‘about’ related entity IDs used to build display and navigation links
Clicking sidebar links triggers new API calls to rebuild the sidebar display as entity relationships are followed
Knowledge Graph navigation via a sidebar
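The call flow described above can be modelled with a small in-memory stand-in for the Knowledge Graph. The real sidebar API is JavaScript and its endpoints and response shape are not shown in the slides, so every name, ID and field below is hypothetical; the Python sketch only illustrates the sequence of calls.

```python
# Tiny in-memory stand-in for the Knowledge Graph; all IDs and labels are invented.
KG = {
    "entity:article-1": {"name": "An article", "type": "Article", "about": ["entity:sam"]},
    "entity:sam": {"name": "Singapore Art Museum", "type": "Organization", "about": ["entity:art"]},
    "entity:art": {"name": "Art", "type": "Thing", "about": []},
}
ARTICLE_TO_ENTITY = {"article-123": "entity:article-1"}


def describe(entity_id):
    """Return an entity description with its 'about' links resolved to labels."""
    entity = KG[entity_id]
    links = [{"id": rid, "label": KG[rid]["name"]} for rid in entity["about"] if rid in KG]
    return {"id": entity_id, "name": entity["name"], "type": entity["type"], "links": links}


def sidebar_for_article(article_id):
    """First call: the host page passes its article ID."""
    return describe(ARTICLE_TO_ENTITY[article_id])


def follow_link(sidebar, index):
    """Later calls: clicking a sidebar link rebuilds the sidebar for that entity."""
    return describe(sidebar["links"][index]["id"])


first = sidebar_for_article("article-123")
second = follow_link(first, 0)
print(first["name"], "->", second["name"])
```

The host system only needs to know its own article IDs; everything else it displays comes back from the Knowledge Graph.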
KG Quality Enhancement from Authorities
LCNAF URI Ingestion
• For Person / Organization entities with LCNAF URIs
• Created via the marc2bibframe2 scripts – from $0 subfield
• The scripts create rdfs:label values from the MARC record,
e.g. 700$a + 700$d
• These values are not controlled – entity can have several different labels
• Use LCNAF authority data to introduce naming consistency
• Lookup against LCNAF to identify & ingest authoritative version
• LCNAF values take precedence in primary entity consolidation
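A minimal sketch of why labels diverge and how an authority value can settle the question: labels are assembled from whichever subfields each MARC record happens to carry, and an ingested LCNAF label, when present, wins. The function names, the sample name and the fallback rule used when no LCNAF label exists are all assumptions for illustration.

```python
def label_from_marc(subfields: dict) -> str:
    """Join 700$a and 700$d into a label, as in the '700$a + 700$d' example."""
    parts = [subfields.get("a", "").strip(), subfields.get("d", "").strip()]
    return " ".join(p for p in parts if p)


def preferred_label(marc_labels, lcnaf_label=None):
    """Prefer the LCNAF label when one has been ingested; otherwise fall back."""
    if lcnaf_label:
        return lcnaf_label
    if not marc_labels:
        return None
    # Assumed fallback: most frequent label, ties broken alphabetically.
    return max(sorted(set(marc_labels)), key=marc_labels.count)


# Invented name, not real authority data.
labels = [
    label_from_marc({"a": "Tan, Example,", "d": "1900-1980"}),
    label_from_marc({"a": "Tan, Example"}),
]
print(labels)
print(preferred_label(labels, lcnaf_label="Tan, Example, 1900-1980"))
```

Without the authority value, two records for the same URI yield two competing labels; with it, consolidation has a single controlled form to prefer.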
Quality Enrichment from Authorities
LCNAF URI Ingestion
[Figure: two MARC XML examples, each with its Bibframe RDF conversion, and the entity result in the Knowledge Graph]
Which is correct?
Ingest from LCNAF and give precedence in consolidation
Quality Enhancement from Authorities
LCNAF Person & Organization Name Matching
• For all Person and Organization primary entities
• Perform a string-matching LCNAF lookup for schema:name values
• Automatic background process
• If exact match
– Ingest LCNAF entity – takes precedence in consolidation
• If close match
– Add to list of match candidates
– [Human] curator either accepts as a match or not
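The exact/close triage can be sketched with the standard library's difflib. The similarity measure, the 0.9 threshold and the case and whitespace normalisation applied before comparing are assumptions; the slides only state that exact matches are ingested automatically and close matches go to a human curator.

```python
from difflib import SequenceMatcher


def classify_match(local_name, authority_name, close_threshold=0.9):
    """Return 'exact', 'close' or 'none' for a local name versus an authority heading."""
    a = " ".join(local_name.lower().split())
    b = " ".join(authority_name.lower().split())
    if a == b:
        return "exact"
    if SequenceMatcher(None, a, b).ratio() >= close_threshold:
        return "close"
    return "none"


def triage(local_name, authority_names):
    """Exact match: ingest automatically. Close matches: queue for a human curator."""
    result = {"ingest": None, "candidates": []}
    for heading in authority_names:
        kind = classify_match(local_name, heading)
        if kind == "exact":
            result["ingest"] = heading
            result["candidates"] = []
            break
        if kind == "close":
            result["candidates"].append(heading)
    return result


# Invented authority headings, for illustration only.
outcome = triage("Singapore Art Museum", ["Singapore Arts Museum", "National Gallery"])
print(outcome)
```

Keeping the automatic path strict and routing everything else to a candidate list means a wrong authority link can only enter the graph through a curator's decision.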
NLB Linked Data Management System (LDMS)
• 2 years in development
• Live and operational for 1.5 years
• Built on a 666M triple Knowledge Graph
• Automatically updated daily
• Using Bibframe & Schema.org
• Built on – not replacing – established systems & practices
• A Linked Data Service for NLB
– Utilizing external authorities to enrich and standardize descriptions
– Part of Open Linked Data Cloud – via Entity Data Service
– Enriching user journeys on non-linked data systems – via sidebar API
  • 12.
    12 Over 560,000 Singapore & SEA items Over147,000 Chinese, Malay & Tamil Languages items Reference Collection Over 62,000 Social Sciences & Humanities items Over 39,000 Science & Technology items Over 53,000 Arts items Over 19,000 Rare Materials items Archival Materials Over 290,000 Government files & Parliament papers Over 190,000 Audiovisual & sound recordings Over 70,000 Maps & building plans Over 1.14m Photographs Over 35,000 Oral history interviews Over 55,000 Speeches & press releases Over 7,000 Posters National Library Board Singapore Over 5m print collection Over 2.4m music tracks 78 databases Over 7,400 e-newspapers and e-magazines titles Over 8,000 e-learning courses Over 1.7m e-books and audio books Lending Collection
  • 13.
  • 14.
    15 The Ambition • Toenable the discovery & display of entitles from different sources in a combined interface
  • 15.
    16 The Ambition • Toenable the discovery & display of entitles from different sources in a combined interface • To bring together resources physical and digital
  • 16.
    17 The Ambition • Toenable the discovery & display of entitles from different sources in a combined interface • To bring together resources physical and digital • To bring together diverse systems across the National Library, National Archives, and Public Libraries in a Linked Data Environment
  • 17.
    18 The Ambition • Toenable the discovery & display of entitles from different sources in a combined interface • To bring together resources physical and digital • To bring together diverse systems across the National Library, National Archives, and Public Libraries in a Linked Data Environment • To provide a staff interface to view and manage all entities, their descriptions and relationships
  • 18.
    19 The Ambition –Technical Challenges • To produce a Knowledge Graph that is [daily] up to date
  • 19.
    20 The Ambition –Technical Challenges • To produce a Knowledge Graph that is [daily] up to date • Not to replace current cataloging processes & practices
  • 20.
    21 The Ambition –Technical Challenges • To produce a Knowledge Graph that is [daily] up to date • Not to replace current cataloging processes & practices – Marc cataloguing in the ILS
  • 21.
    22 The Ambition –Technical Challenges • To produce a Knowledge Graph that is [daily] up to date • Not to replace current cataloging processes & practices – Marc cataloguing in the ILS – TTE maintenance in authority control
  • 22.
    23 The Ambition –Technical Challenges • To produce a Knowledge Graph that is [daily] up to date • Not to replace current cataloging processes & practices – Marc cataloguing in the ILS – TTE maintenance in authority control – Dublin Core content management for CMS sites and Archives
  • 23.
    24 The Ambition –Technical Challenges • To produce a Knowledge Graph that is [daily] up to date • Not to replace current cataloging processes & practices – Marc cataloguing in the ILS – TTE maintenance in authority control – Dublin Core content management for CMS sites and Archives • Data sharable with the world – Linked Open Data – Schema.org
  • 24.
    25 The Ambition –Technical Challenges • To produce a Knowledge Graph that is [daily] up to date • Not to replace current cataloging processes & practices – Marc cataloguing in the ILS – TTE maintenance in authority control – Dublin Core content management for CMS sites and Archives • Data sharable with the world – Linked Open Data – Schema.org • An aggregated source of truth
  • 25.
    26 Contract Awarded metaphactory platform Low-codeknowledge graph platform Semantic knowledge modeling Semantic search & discovery AWS Partner Public sector partner Singapore based Linked Data, Structured data, Semantic Web, bibliographic meta data, Schema.org and management systems consultant
  • 26.
    27 Basic Data Model •Linked Data – BIBFRAME to capture detail of bibliographic records – Schema.org to deliver structured data for search engines
  • 27.
    28 Basic Data Model •Linked Data – BIBFRAME to capture detail of bibliographic records – Schema.org to deliver structured data for search engines – Schema.org representation of CMS, NAS, TTE data
  • 28.
    29 Basic Data Model •Linked Data – BIBFRAME to capture detail of bibliographic records – Schema.org to deliver structured data for search engines – Schema.org representation of CMS, NAS, TTE data – Schema.org enrichment of BIBFRAME
  • 29.
    30 Basic Data Model •Linked Data – BIBFRAME to capture detail of bibliographic records – Schema.org to deliver structured data for search engines – Schema.org representation of CMS, NAS, TTE data – Schema.org enrichment of BIBFRAME • Schema.org as the ‘lingua franca’ vocabulary of the Knowledge graph
  • 30.
    31 Basic Data Model •Linked Data – BIBFRAME to capture detail of bibliographic records – Schema.org to deliver structured data for search engines – Schema.org representation of CMS, NAS, TTE data – Schema.org enrichment of BIBFRAME • Schema.org as the ‘lingua franca’ vocabulary of the Knowledge graph – All entities described using Schema.org as a minimum.
  • 31.
    32 Data Data Data! DataSource Source Records Entity Count Update Frequency ILS 1.4m 7.9m Daily CMS 82k 228k Weekly NAS 1.6m 6.7m Monthly TTE 3k 317k Monthly 3.1m 15.15m
  • 32.
    33 Data Ingest Pipelines •Triggered by data upload from source system
  • 33.
    34 Data Ingest Pipelines •Triggered by data upload from source system • ILS – daily
  • 34.
    35 Data Ingest Pipelines •Triggered by data upload from source system • ILS – daily – MARC-XML parsed through Open Source scripts: • Marc2bibframe2 – Library of Congress • Bibframe2schema – Bibframe2Schema.org
  • 35.
    36 Data Ingest Pipelines •Triggered by data upload from source system • ILS – daily – MARC-XML parsed through Open Source scripts: • Marc2bibframe2 – Library of Congress • Bibframe2schema – Bibframe2Schema.org • TTE Authorities – Monthly
  • 36.
    37 Data Ingest Pipelines •Triggered by data upload from source system • ILS – daily – MARC-XML parsed through Open Source scripts: • Marc2bibframe2 – Library of Congress • Bibframe2schema – Bibframe2Schema.org • TTE Authorities – Monthly – Bespoke CSV conversion
  • 37.
    38 Data Ingest Pipelines •Triggered by data upload from source system • ILS – daily – MARC-XML parsed through Open Source scripts: • Marc2bibframe2 – Library of Congress • Bibframe2schema – Bibframe2Schema.org • TTE Authorities – Monthly – Bespoke CSV conversion • CMS & NAS – Weekly / Monthly
  • 38.
    39 Data Ingest Pipelines •Triggered by data upload from source system • ILS – daily – MARC-XML parsed through Open Source scripts: • Marc2bibframe2 – Library of Congress • Bibframe2schema – Bibframe2Schema.org • TTE Authorities – Monthly – Bespoke CSV conversion • CMS & NAS – Weekly / Monthly – Dublin Core to Schema.org
  • 39.
    40 Technical Architecture (simplified) Hostedon Amazon Web Services Batch Scripts import control Etc. SOURCE DATA IMPORT
  • 40.
    41 Technical Architecture (simplified) Hostedon Amazon Web Services Pipeline processing Batch Scripts import control Etc. SOURCE DATA IMPORT
  • 41.
    42 Technical Architecture (simplified) Hostedon Amazon Web Services GraphDB Cluster GraphDB Cluster GraphDB Cluster GraphDB Cluster Pipeline processing Batch Scripts import control Etc. SOURCE DATA IMPORT
  • 42.
    43 Technical Architecture (simplified) Hostedon Amazon Web Services EDS GraphDB Cluster GraphDB Cluster GraphDB Cluster GraphDB Cluster Pipeline processing Batch Scripts import control Etc. SOURCE DATA IMPORT
  • 43.
    44 Technical Architecture (simplified) Hostedon Amazon Web Services EDS GraphDB Cluster GraphDB Cluster GraphDB Cluster GraphDB Cluster Pipeline processing Batch Scripts import control Etc. SOURCE DATA IMPORT DMI
  • 44.
    45 A need forentity reconciliation ….. • Lots (and lots and lots) of source entities – 10 million entities
  • 45.
    46 A need forentity reconciliation ….. • Lots (and lots and lots) of source entities – 10 million entities • Lots of duplication – Lee, Kuan Yew – 1st Prime Minister of Singapore • 160 individual entities in ILS source data
  • 46.
    47 A need forentity reconciliation ….. • Lots (and lots and lots) of source entities – 10 million entities • Lots of duplication – Lee, Kuan Yew – 1st Prime Minister of Singapore • 160 individual entities in ILS source data – Singapore Art Museum • Entities from source data • 21 CMS, 1 NAS, 66 ILS, 1 TTE
  • 47.
    48 A need forentity reconciliation ….. • Lots (and lots and lots) of source entities – 10 million entities • Lots of duplication – Lee, Kuan Yew – 1st Prime Minister of Singapore • 160 individual entities in ILS source data – Singapore Art Museum • Entities from source data • 21 CMS, 1 NAS, 66 ILS, 1 TTE • Users only want 1 of each!
  • 48.
    49 Adaptive Data ModelConcepts • Source entitles – Individual representation of source data
  • 49.
    50 Adaptive Data ModelConcepts • Source entitles – Individual representation of source data • Aggregation entities – Tracking relationships between source entities for the same thing – No copying of attributes
  • 50.
    51 Adaptive Data ModelConcepts • Source entitles – Individual representation of source data • Aggregation entities – Tracking relationships between source entities for the same thing – No copying of attributes • Primary Entities – Searchable by users – Displayable to users – Consolidation of aggregated source data & managed attributes
  • 51.
  • 52.
  • 53.
  • 54.
  • 55.
  • 56.
    Authority Content Library System Singapore Art Museum Organization SingaporeArt Museum Organization Singapore Art Museum Organization LDMS Aggregation Model
  • 57.
    Authority Content Library System Singapore Art Museum Organization SingaporeArt Museum Organization Singapore Art Museum Organization ore:Aggregation b8be-8df3f3ac3203 LDMS Aggregation Model
  • 58.
    Authority Content Library System Singapore Art Museum Organization SingaporeArt Museum Organization Singapore Art Museum Organization ore:Aggregation b8be-8df3f3ac3203 ore:aggregates LDMS Aggregation Model
  • 59.
    Authority Content Library System Singapore Art Museum Organization SingaporeArt Museum Organization Singapore Art Museum Organization ore:Aggregation b8be-8df3f3ac3203 ore:aggregates LDMS Aggregation Model
  • 60.
    Authority Content Library System Singapore Art Museum Organization SingaporeArt Museum Organization Singapore Art Museum Organization Singapore Art Museum Organization ore:Aggregation b8be-8df3f3ac3203 ore:Aggregation b8be-8df3f3ac3203 ore:aggregates ore:aggregates LDMS Aggregation Model
  • 61.
    Authority Content Library System Singapore Art Museum Organization SingaporeArt Museum Organization Singapore Art Museum Organization Singapore Art Museum Organization Singapore Art Museum Organization ore:Aggregation b8be-8df3f3ac3203 ore:Aggregation b8be-8df3f3ac3203 ore:Aggregation b8be-8df3f3ac3203 ore:aggregates ore:aggregates ore:aggregates LDMS Aggregation Model
  • 62.
    Authority Content Library System Singapore Art Museum Organization SingaporeArt Museum Organization Singapore Art Museum Organization Singapore Art Museum Organization Singapore Art Museum Organization ore:Aggregation b8be-8df3f3ac3203 ore:Aggregation b8be-8df3f3ac3203 ore:Aggregation b8be-8df3f3ac3203 Singapore Art Museum Organization ore:aggregates ore:aggregates ore:aggregates LDMS Aggregation Model
  • 63.
    Authority Content Library System Singapore Art Museum Organization SingaporeArt Museum Organization Singapore Art Museum Organization Singapore Art Museum Organization Singapore Art Museum Organization ore:Aggregation b8be-8df3f3ac3203 ore:Aggregation b8be-8df3f3ac3203 ore:Aggregation b8be-8df3f3ac3203 Singapore Art Museum Organization ore:aggregates ore:aggregates ore:aggregates ore:isAggregatedBy LDMS Aggregation Model
  • 64.
    Authority Content Library System Singapore Art Museum Organization SingaporeArt Museum Organization Singapore Art Museum Organization Singapore Art Museum Organization Singapore Art Museum Organization ore:Aggregation b8be-8df3f3ac3203 ore:Aggregation b8be-8df3f3ac3203 ore:Aggregation b8be-8df3f3ac3203 Singapore Art Museum Organization ore:aggregates ore:aggregates ore:aggregates ore:isAggregatedBy LDMS Aggregation Model ore:isAggregatedBy
  • 65.
  • 66.
  • 67.
  • 68.
  • 69.
  • 70.
  • 71.
  • 72.
  • 73.
  • 74.
  • 75.
  • 76.
  • 77.
  • 78.
  • 79.
  • 80.
    86 The NLB KnowledgeGraph • 666M Triples
  • 81.
    87 The NLB KnowledgeGraph • 666M Triples • 10M Source Entities
  • 82.
    88 The NLB KnowledgeGraph • 666M Triples • 10M Source Entities • 5.8M Primary Entities
  • 83.
    89 The NLB KnowledgeGraph • 666M Triples • 10M Source Entities • 5.8M Primary Entities – Aggregation of source derived entities
  • 84.
    90 The NLB KnowledgeGraph • 666M Triples • 10M Source Entities • 5.8M Primary Entities – Aggregation of source derived entities – Searchable
  • 85.
    91 The NLB KnowledgeGraph • 666M Triples • 10M Source Entities • 5.8M Primary Entities – Aggregation of source derived entities – Searchable – Shared with world
  • 86.
    92 The NLB KnowledgeGraph • 666M Triples • 10M Source Entities • 5.8M Primary Entities – Aggregation of source derived entities – Searchable – Shared with world
  • 87.
    93 NLB Linked DataManagement System (LDMS) • Powered by the Knowledge Graph
  • 88.
    94 NLB Linked DataManagement System (LDMS) • Powered by the Knowledge Graph • Updated daily
  • 89.
    95 NLB Linked DataManagement System (LDMS) • Powered by the Knowledge Graph • Updated daily • A new separate environment built on established systems
  • 90.
    96 NLB Linked DataManagement System (LDMS) • Powered by the Knowledge Graph • Updated daily • A new separate environment built on established systems • No changes in cataloguing practices
  • 91.
    97 NLB Linked DataManagement System (LDMS) • Powered by the Knowledge Graph • Updated daily • A new separate environment built on established systems • No changes in cataloguing practices • No cataloguer retraining
  • 92.
    98 NLB Linked DataManagement System (LDMS) • Powered by the Knowledge Graph • Updated daily • A new separate environment built on established systems • No changes in cataloguing practices • No cataloguer retraining • Not just the bibliographic (MARC) data
  • 93.
    99 NLB Linked DataManagement System (LDMS) • Powered by the Knowledge Graph • Updated daily • A new separate environment built on established systems • No changes in cataloguing practices • No cataloguer retraining • Not just the bibliographic (MARC) data • No replacement systems – to implement Linked Data
  • 94.
    100 NLB Linked DataManagement System (LDMS) • Powered by the Knowledge Graph • Updated daily • A new separate environment built on established systems • No changes in cataloguing practices • No cataloguer retraining • Not just the bibliographic (MARC) data • No replacement systems – to implement Linked Data – MARC based ILS swap out occurred mid project – without LDMS impact
  • 95.
    101 NLB Linked DataManagement System (LDMS) • Powered by the Knowledge Graph • Updated daily • A new separate environment built on established systems • No changes in cataloguing practices • No cataloguer retraining • Not just the bibliographic (MARC) data • No replacement systems – to implement Linked Data – MARC based ILS swap out occurred mid project – without LDMS impact • Delivering Linked Data benefits back into the organization
  • 96.
    102 Building on theKnowledge Graph Entity Data Service • Open Linked Data interface
  • 97.
    103 Building on theKnowledge Graph Entity Data Service • Open Linked Data interface • Dereferencing entity URIs
  • 98.
    104 Building on theKnowledge Graph Entity Data Service • Open Linked Data interface • Dereferencing entity URIs • Content negotiation for RDF/XML / JSON-LD / Turtle / N-Triples
  • 99.
    105 Building on theKnowledge Graph Entity Data Service • Open Linked Data interface • Dereferencing entity URIs • Content negotiation for RDF/XML / JSON-LD / Turtle / N-Triples • Download formats RDF/XML / JSON-LD / Turtle / N-Triples
  • 100.
    106 Building on theKnowledge Graph Entity Data Service • Open Linked Data interface • Dereferencing entity URIs • Content negotiation for RDF/XML / JSON-LD / Turtle / N-Triples • Download formats RDF/XML / JSON-LD / Turtle / N-Triples • Embedded Schema.org
  • 101.
    107 Building on theKnowledge Graph Entity Data Service • Open Linked Data interface • Dereferencing entity URIs • Content negotiation for RDF/XML / JSON-LD / Turtle / N-Triples • Download formats RDF/XML / JSON-LD / Turtle / N-Triples • Embedded Schema.org • Enhanced navigation
  • 105.
    111 Building on theKnowledge Graph Enriching the User Journey
  • 106.
    112 Building on theKnowledge Graph Enriching the User Journey • Systems are often silos
  • 107.
    113 Building on theKnowledge Graph Enriching the User Journey • Systems are often silos • User search and navigation constrained by their own data
  • 108.
    114 Building on theKnowledge Graph Enriching the User Journey • Systems are often silos • User search and navigation constrained by their own data • Knowledge Graph populated from several individual systems
  • 109.
    115 Building on theKnowledge Graph Enriching the User Journey • Systems are often silos • User search and navigation constrained by their own data • Knowledge Graph populated from several individual systems • Entities aggregated and related across system sources
  • 110.
    116 Building on theKnowledge Graph Enriching the User Journey • Systems are often silos • User search and navigation constrained by their own data • Knowledge Graph populated from several individual systems • Entities aggregated and related across system sources • The fuel to explore between systems
  • 111.
    117 Building on theKnowledge Graph Enriching the User Journey • Systems are often silos • User search and navigation constrained by their own data • Knowledge Graph populated from several individual systems • Entities aggregated and related across system sources • The fuel to explore between systems • Via a navigational interface sidebar
  • 112.
    118 Building on theKnowledge Graph Enriching the User Journey • Systems are often silos • User search and navigation constrained by their own data • Knowledge Graph populated from several individual systems • Entities aggregated and related across system sources • The fuel to explore between systems • Via a navigational interface sidebar • Plugged into user interface
  • 113.
    119 Building on theKnowledge Graph Enriching the User Journey • Systems are often silos • User search and navigation constrained by their own data • Knowledge Graph populated from several individual systems • Entities aggregated and related across system sources • The fuel to explore between systems • Via a navigational interface sidebar • Plugged into user interface • Powered by a JavaScript Sidebar API
  • 116.
    Use of theJavaScript Sidebar API
  • 117.
    Use of theJavaScript Sidebar API → API call to KG – article ID passed as parameter
  • 118.
    Use of theJavaScript Sidebar API → API call to KG – article ID passed as parameter ← Description of associated Primary entity returned
  • 119.
    Description includes listof ‘about’ related entity IDs used to build display and navigation links Use of the JavaScript Sidebar API → API call to KG – article ID passed as parameter ← Description of associated Primary entity returned
  • 120.
    Clicking sidebar linkstrigger new API calls to rebuild the sidebar display as entity relationships are followed Description includes list of ‘about’ related entity IDs used to build display and navigation links Use of the JavaScript Sidebar API → API call to KG – article ID passed as parameter ← Description of associated Primary entity returned
  • 121.
    Clicking sidebar linkstrigger new API calls to rebuild the sidebar display as entity relationships are followed Knowledge Graph navigation via a sidebar Description includes list of ‘about’ related entity IDs used to build display and navigation links Use of the JavaScript Sidebar API → API call to KG – article ID passed as parameter ← Description of associated Primary entity returned
  • 122.
    Clicking sidebar linkstrigger new API calls to rebuild the sidebar display as entity relationships are followed Knowledge Graph navigation via a sidebar Description includes list of ‘about’ related entity IDs used to build display and navigation links Use of the JavaScript Sidebar API → API call to KG – article ID passed as parameter ← Description of associated Primary entity returned
  • 123.
    129 KG Quality Enhancementfrom Authorities LCNAF URI Ingestion • For Person / Organization entities with LCNAF URIs
  • 124.
    130 KG Quality Enhancementfrom Authorities LCNAF URI Ingestion • For Person / Organization entities with LCNAF URIs • Created via the marc2bibframe2 scripts - from $0 subfield
  • 125.
    131 KG Quality Enhancementfrom Authorities LCNAF URI Ingestion • For Person / Organization entities with LCNAF URIs • Created via the marc2bibframe2 scripts - from $0 subfield • Create rdfs:label values from the marc record eg. 700$a + 700$d
  • 126.
    132 KG Quality Enhancementfrom Authorities LCNAF URI Ingestion • For Person / Organization entities with LCNAF URIs • Created via the marc2bibframe2 scripts - from $0 subfield • Create rdfs:label values from the marc record eg. 700$a + 700$d • These values are not controlled – entity can have several different labels
  • 127.
    133 KG Quality Enhancementfrom Authorities LCNAF URI Ingestion • For Person / Organization entities with LCNAF URIs • Created via the marc2bibframe2 scripts - from $0 subfield • Create rdfs:label values from the marc record eg. 700$a + 700$d • These values are not controlled – entity can have several different labels • Use LCNAF authority data to introduce naming consistency
  • 128.
    134 KG Quality Enhancementfrom Authorities LCNAF URI Ingestion • For Person / Organization entities with LCNAF URIs • Created via the marc2bibframe2 scripts - from $0 subfield • Create rdfs:label values from the marc record eg. 700$a + 700$d • These values are not controlled – entity can have several different labels • Use LCNAF authority data to introduce naming consistency • Lookup against LCNAF to identify & ingest authoritative version
  • 129.
    135 KG Quality Enhancementfrom Authorities LCNAF URI Ingestion • For Person / Organization entities with LCNAF URIs • Created via the marc2bibframe2 scripts - from $0 subfield • Create rdfs:label values from the marc record eg. 700$a + 700$d • These values are not controlled – entity can have several different labels • Use LCNAF authority data to introduce naming consistency • Lookup against LCNAF to identify & ingest authoritative version • LCNAF values take precedence in primary entity consolidation
  • 130.
    Quality Enrichment fromAuthorities LCNAF URI Ingestion MARC XML:
  • 131.
    Quality Enrichment fromAuthorities LCNAF URI Ingestion MARC XML:
  • 132.
    Quality Enrichment fromAuthorities LCNAF URI Ingestion MARC XML: Bibframe RDF:
  • 133.
    Quality Enrichment fromAuthorities LCNAF URI Ingestion MARC XML: Bibframe RDF:
  • 134.
    Quality Enrichment fromAuthorities LCNAF URI Ingestion MARC XML: Bibframe RDF: MARC XML:
  • 135.
    Quality Enrichment fromAuthorities LCNAF URI Ingestion MARC XML: Bibframe RDF: MARC XML: Bibframe RDF:
  • 136.
    Quality Enrichment fromAuthorities LCNAF URI Ingestion MARC XML: Bibframe RDF: MARC XML: Bibframe RDF: Entity result in Knowledge Graph Which is correct?
  • 137.
    Quality Enrichment fromAuthorities LCNAF URI Ingestion MARC XML: Bibframe RDF: MARC XML: Bibframe RDF: Entity result in Knowledge Graph Which is correct?
  • 138.
    Quality Enrichment fromAuthorities LCNAF URI Ingestion MARC XML: Bibframe RDF: MARC XML: Bibframe RDF: Entity result in Knowledge Graph Which is correct? Ingest from LCNAF and give precedence in consolidation
  • 139.
    145 Quality Enhancement fromAuthorities LCNAF Person & Organization Name Matching • For all Person and Organization primary entities
  • 140.
    146 Quality Enhancement fromAuthorities LCNAF Person & Organization Name Matching • For all Person and Organization primary entities • Perform a string-matching LCNAF lookup for schema:name values
  • 141.
    147 Quality Enhancement fromAuthorities LCNAF Person & Organization Name Matching • For all Person and Organization primary entities • Perform a string-matching LCNAF lookup for schema:name values • Automatic background process
  • 142.
    148 Quality Enhancement fromAuthorities LCNAF Person & Organization Name Matching • For all Person and Organization primary entities • Perform a string-matching LCNAF lookup for schema:name values • Automatic background process • If exact match – Ingest LCNAF entity – takes precedence in consolidation
  • 143.
    149 Quality Enhancement fromAuthorities LCNAF Person & Organization Name Matching • For all Person and Organization primary entities • Perform a string-matching LCNAF lookup for schema:name values • Automatic background process • If exact match – Ingest LCNAF entity – takes precedence in consolidation • If close match – Add to list of match candidates
  • 144.
    150 Quality Enhancement fromAuthorities LCNAF Person & Organization Name Matching • For all Person and Organization primary entities • Perform a string-matching LCNAF lookup for schema:name values • Automatic background process • If exact match – Ingest LCNAF entity – takes precedence in consolidation • If close match – Add to list of match candidates – [Human] curator either accepts as a match or not
  • 145.
    Quality Enrichment fromAuthorities LCNAF Person & Organization Name Matching
  • 146.
    Quality Enrichment fromAuthorities LCNAF Person & Organization Name Matching
  • 147.
    153 • 2 yearsin development NLB Linked Data Management System (LDMS)
  • 148.
    154 • 2 yearsin development • Live and operational for 1.5 years NLB Linked Data Management System (LDMS)
  • 149.
    155 • 2 yearsin development • Live and operational for 1.5 years • Built on a 666M triple Knowledge Graph NLB Linked Data Management System (LDMS)
  • 150.
    156 • 2 yearsin development • Live and operational for 1.5 years • Built on a 666M triple Knowledge Graph • Automatically updated daily NLB Linked Data Management System (LDMS)
  • 151.
    157 • 2 yearsin development • Live and operational for 1.5 years • Built on a 666M triple Knowledge Graph • Automatically updated daily • Using Bibframe & Schema.org NLB Linked Data Management System (LDMS)
  • 152.
    158 • 2 yearsin development • Live and operational for 1.5 years • Built on a 666M triple Knowledge Graph • Automatically updated daily • Using Bibframe & Schema.org • Built on – not replacing – established systems & practices NLB Linked Data Management System (LDMS)
  • 153.
    159 • 2 yearsin development • Live and operational for 1.5 years • Built on a 666M triple Knowledge Graph • Automatically updated daily • Using Bibframe & Schema.org • Built on – not replacing – established systems & practices • A Linked Data Service for NLB NLB Linked Data Management System (LDMS)
  • 154.
    160 • 2 yearsin development • Live and operational for 1.5 years • Built on a 666M triple Knowledge Graph • Automatically updated daily • Using Bibframe & Schema.org • Built on – not replacing – established systems & practices • A Linked Data Service for NLB – Utilizing external authorities to enrich and standardize descriptions NLB Linked Data Management System (LDMS)
  • 155.
    161 • 2 yearsin development • Live and operational for 1.5 years • Built on a 666M triple Knowledge Graph • Automatically updated daily • Using Bibframe & Schema.org • Built on – not replacing – established systems & practices • A Linked Data Service for NLB – Utilizing external authorities to enrich and standardize descriptions – Part of Open Linked Data Cloud – via Entity Data Service NLB Linked Data Management System (LDMS)
  • 156.
    162 • 2 yearsin development • Live and operational for 1.5 years • Built on a 666M triple Knowledge Graph • Automatically updated daily • Using Bibframe & Schema.org • Built on – not replacing – established systems & practices • A Linked Data Service for NLB – Utilizing external authorities to enrich and standardize descriptions – Part of Open Linked Data Cloud – via Entity Data Service – Enriching user journeys on non-linked data systems – via sidebar API NLB Linked Data Management System (LDMS)
  • 157.
    A Semantic KnowledgeGraph at National Library Board Singapore 2024 LD4 Conference 7th October 2024 - Online Richard Wallis Evangelist and Founder Data Liberate [email protected]