A Closer Look at the Changing Dynamics of
DBpedia Mappings
Dr.-Ing. Maribel Acosta
14th DBpedia Community Meeting, Karlsruhe, Germany
Motivation: Collaborative Crowdsourcing
F. Flöck, M. Acosta:
WikiWho: precise and efficient attribution of authorship of revisioned content. WWW 2014.
https://www.wikiwho.net
German Wikipedia article → Autoren (bottom of the page) → wikiwho
Approach
Data Collection Pipeline
API endpoint: http://mappings.dbpedia.org/api.php
Step 1: Get mapping namespaces
{'action': 'query',
 'meta': 'siteinfo',
 'siprop': 'namespaces'}
Step 2: Get mapping IDs per namespace
{'action': 'query',
 'list': 'allpages',
 'apnamespace': %ns%,
 'apcontinue': %cont%}
Step 3: Get mapping metadata
{'action': 'query',
 'pageids': %pageid%,
 'prop': 'revisions',
 'rvlimit': 'max',
 'continue': %cont%,
 'rvprop': 'timestamp|user|userid'}
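The three requests can be chained roughly as in the following minimal sketch. It is not the author's code: the function names and the injectable `fetch` callable are hypothetical, and it assumes standard MediaWiki behavior, namely JSON responses in which pages are keyed by page ID and a `continue` object carries the tokens to merge into the next request.

```python
import json
import urllib.parse
import urllib.request

API_URL = "http://mappings.dbpedia.org/api.php"  # endpoint named on the slide


def http_fetch(params):
    """Send one GET request to the API and decode the JSON response."""
    query = urllib.parse.urlencode({**params, "format": "json"})
    with urllib.request.urlopen(f"{API_URL}?{query}") as response:
        return json.load(response)


def query_all(params, fetch=http_fetch):
    """Yield every response batch, following the API's 'continue' object."""
    request = dict(params)
    while True:
        result = fetch(request)
        yield result
        if "continue" not in result:
            break
        # Merge the continuation tokens into the original parameters.
        request = {**params, **result["continue"]}


def namespace_params():
    """Step 1: list the namespaces of the wiki."""
    return {"action": "query", "meta": "siteinfo", "siprop": "namespaces"}


def page_list_params(namespace_id):
    """Step 2: list the pages (mappings) of one namespace."""
    return {"action": "query", "list": "allpages", "apnamespace": namespace_id}


def revision_params(page_id):
    """Step 3: request the revision metadata of one page."""
    return {"action": "query", "pageids": page_id, "prop": "revisions",
            "rvlimit": "max", "rvprop": "timestamp|user|userid"}


def collect_revisions(page_id, fetch=http_fetch):
    """Return all revision records (timestamp, user, userid) of one page."""
    revisions = []
    for result in query_all(revision_params(page_id), fetch):
        page = result["query"]["pages"][str(page_id)]
        revisions.extend(page.get("revisions", []))
    return revisions
```

Passing `fetch` as a parameter keeps the paging logic testable without network access; in real use the default `http_fetch` would be called, ideally with a delay between requests.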
Overview of the Results
Total number of mappings: 6,374
Total number of edits: 28,664
Total number of contributors: 283
As of 11 August 2019
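For orientation, the totals imply the following averages. These means are derived here from the three reported figures and are not stated on the slides; they say nothing about how skewed the underlying distributions are.

```python
# Totals reported on the slide.
TOTAL_MAPPINGS = 6374
TOTAL_EDITS = 28664
TOTAL_CONTRIBUTORS = 283

# Derived averages (not reported in the original talk).
edits_per_mapping = TOTAL_EDITS / TOTAL_MAPPINGS
edits_per_contributor = TOTAL_EDITS / TOTAL_CONTRIBUTORS

print(f"Mean edits per mapping: {edits_per_mapping:.2f}")          # 4.50
print(f"Mean edits per contributor: {edits_per_contributor:.1f}")  # 101.3
```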
Mappings per DBpedia Chapter
• Coverage of DBpedia mappings with respect to Wikipedia properties
• Chapters with a high number of mappings (>400): English, Dutch, Serbian
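A count like this can be obtained by grouping page titles by chapter. The sketch below assumes titles of the form `Mapping en:Infobox officeholder`, as they appear elsewhere in these slides; the function names are hypothetical and the threshold of 400 is taken from the bullet above.

```python
from collections import Counter


def chapter_of(title):
    """Extract the chapter code from a title such as 'Mapping en:Infobox officeholder'."""
    namespace, _, _ = title.partition(":")
    prefix, _, code = namespace.partition(" ")
    if prefix != "Mapping" or not code:
        return None  # not a chapter mapping page
    return code


def mappings_per_chapter(titles):
    """Count mapping pages per chapter code, ignoring other pages."""
    codes = (chapter_of(title) for title in titles)
    return Counter(code for code in codes if code is not None)


def large_chapters(counts, threshold=400):
    """Return the chapter codes with more than `threshold` mappings."""
    return sorted(code for code, n in counts.items() if n > threshold)
```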
Edits over Time
• Freshness of the Wiki
• Maintenance effort
Edits over Time: DBpedia Chapters (en, de)
• (en) Most of the edits were performed when the chapter was created
• (de) Peaks of edits in 2011, 2014, 2015 (mostly about places)
Edits over Time: All DBpedia Chapters
• Most of the edits were performed when the chapter was created
• Edits decreased considerably (in all chapters). Convergence?
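Trends like these can be read off a per-year histogram of revision timestamps. The following is a minimal sketch that assumes the UTC ISO 8601 format MediaWiki uses for the `timestamp` revision property; the function name is hypothetical and the talk does not say how the timestamps were binned.

```python
from collections import Counter
from datetime import datetime


def edits_per_year(timestamps):
    """Count revisions per calendar year from ISO 8601 timestamps."""
    years = Counter()
    for stamp in timestamps:
        # Expected form: '2011-05-03T12:00:00Z' (UTC).
        parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
        years[parsed.year] += 1
    # Sort by year so the result can be plotted directly.
    return dict(sorted(years.items()))
```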
Edits per DBpedia Mapping
• Correctness / quality of individual mappings (?)
Mapping en:Infobox officeholder
Mapping ja:声優
Mapping el:Κουτί πληροφοριών εκλογών
Mapping sv:Geobox
• The Ukrainian chapter has the highest median (over 10 edits per mapping)
• The 75th percentile (Q3) is lower than 10 in most chapters
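The median and Q3 cited above can be computed per chapter as in this sketch. The input layout (chapter code mapped to a list of edit counts) and the function name are assumptions, and the quartile method used in the talk is not stated; the inclusive method is chosen here only for illustration.

```python
import statistics


def summary_per_chapter(edits_by_chapter):
    """Return {chapter: (median, Q3)} for the number of edits per mapping.

    `edits_by_chapter` maps a chapter code to a list with one edit count
    per mapping; each list needs at least two entries.
    """
    summary = {}
    for chapter, counts in edits_by_chapter.items():
        median = statistics.median(counts)
        # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
        q3 = statistics.quantiles(counts, n=4, method="inclusive")[2]
        summary[chapter] = (median, q3)
    return summary
```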
Contributors per DBpedia Chapter
• Completeness of mappings
• Mapping coverage of DBpedia chapters
• Chapters with high number of contributors: English, Portuguese, German, Greek
• No significant correlation between the number of contributors and the number of mappings (ρ = -0.08, p-value = 0.56)
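A correlation check of this kind can be sketched as follows. The slides do not say which coefficient or significance test was used, so this example substitutes Pearson's r with a permutation test, written in plain Python; function names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, rounds=2000, seed=0):
    """Two-sided p-value: share of shuffles with |r| at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point noise.
        if abs(pearson(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (rounds + 1)
```

Here `xs` would hold the number of contributors per chapter and `ys` the number of mappings per chapter; a large p-value, as reported on the slide, means the observed coefficient is compatible with no association. Both inputs must vary, otherwise the denominator in `pearson` is zero.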
Co-editing Network of DBpedia
• Editors contributing to the same mappings
• Editors across several DBpedia Chapters
[Figure: co-editing network; labeled chapters: PT, EL, EN, ES, JA, NL, FR]
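One way to derive such a network from the collected revision metadata is sketched below: two editors are linked when they have edited the same mapping, and the edge weight counts the mappings they share. The input format and function name are assumptions; the talk does not describe how the network was actually constructed.

```python
from collections import defaultdict
from itertools import combinations


def co_editing_edges(edits):
    """Map each editor pair to the number of mappings both have edited.

    `edits` is an iterable of (editor, mapping_title) pairs; repeated
    edits by the same editor to the same mapping count once.
    """
    editors_by_mapping = defaultdict(set)
    for editor, mapping in edits:
        editors_by_mapping[mapping].add(editor)

    edges = defaultdict(int)
    for editors in editors_by_mapping.values():
        # Sorting gives every undirected edge one canonical key.
        for first, second in combinations(sorted(editors), 2):
            edges[(first, second)] += 1
    return dict(edges)
```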
Outlook
Future Work
• Analyze the edits of the ontology namespaces and compare them with
the mapping edits
• Incorporate other collaborative platforms used in the development of
DBpedia, e.g., GitHub, into the analysis
• Identify the type of edit actions – content creation, deletion, correction –
in DBpedia mappings (WikiWho)
• Create the interaction network to understand the collaborative process of
creating the DBpedia mappings (WhoVis)
WhoVis
Fabian Flöck, Maribel Acosta: whoVIS: Visualizing Editor Interactions and Dynamics in Collaborative Writing
Over Time. WWW (Companion Volume) 2015
Data & Results
https://github.com/maribelacosta/dbpedia-analysis