ISWC 2020 Challenge
https://smart-task.github.io/
Question / Answer Type Classification
• A popular task in the field of question answering
• Question classification based on Wh-terms
• Who, What, When, Where, Which, Whom, Whose, Why, How many
• Answer type classification
• predict the type of the answer
• Existing answer type classifications (e.g., TREC QA) use coarse-grained types
• 6 types: PERSON, LOCATION, NUMERIC, ENTITY, DESCRIPTION, ABBREVIATION
• 50 subtypes: ENTITY -> animal, plant, product, sport, religion, event, food, currency
• More fine-grained classifications are possible with Semantic Web ontologies.
• DBpedia (~760 classes), Wikidata (~50K classes)
Li, Xin, and Dan Roth. "Learning question classifiers: the role of semantic information."
Natural Language Engineering 12.3 (2006): 229-249.
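As a rough illustration of Wh-term-based classification, the sketch below maps the leading Wh-term of a question to one of the six coarse types listed above. The rule table and the function name `coarse_answer_type` are assumptions made for this example and are not taken from TREC QA or from any existing system.

```python
import re

# Hypothetical rule table: leading Wh-term -> coarse answer type.
# The labels are the six coarse types named on the slide; the mapping
# itself is an illustrative assumption.
WH_RULES = [
    (r"^how many\b", "NUMERIC"),
    (r"^(who|whom|whose)\b", "PERSON"),
    (r"^where\b", "LOCATION"),
    (r"^when\b", "NUMERIC"),
    (r"^why\b", "DESCRIPTION"),
]


def coarse_answer_type(question: str, default: str = "ENTITY") -> str:
    """Return a coarse answer type based only on the leading Wh-term."""
    q = question.strip().lower()
    for pattern, label in WH_RULES:
        if re.search(pattern, q):
            return label
    # "What" and "Which" questions are ambiguous, so fall back to a default.
    return default
```

Rules like these cannot distinguish, for example, a film from a university in a "Which ..." question, which is the gap that fine-grained ontology classes are meant to fill.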
Knowledge Base Question Answering (KBQA)
• Given a natural language question, generate the SPARQL query to find the answer.
• Popular datasets for KBQA in the Semantic Web community
• Question Answering over Linked Data (QALD)
• http://qald.aksw.org/
• Large-Scale Complex Question Answering Dataset (LC-QuAD)
• http://lc-quad.sda.tech/
• Most KBQA systems use some kind of question / answer type prediction component.
• No standard dataset to evaluate the component performance.
Question: Which films did Stanley Kubrick direct?
SPARQL:
select ?film where {
  ?film dbo:director dbr:Stanley_Kubrick .
}
Answers:
2001: A Space Odyssey
Spartacus
Fear and Desire
Paths of Glory
Lolita
…
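The query in the example above can be produced from a simple template. The sketch below is only an illustration: the function name `build_director_query` is hypothetical, and real KBQA systems have to discover the relation and the entity from the question rather than receive them as arguments. The two prefix IRIs are the standard DBpedia ontology and resource namespaces.

```python
DBO = "http://dbpedia.org/ontology/"
DBR = "http://dbpedia.org/resource/"


def build_director_query(director_local_name: str) -> str:
    """Fill a fixed SPARQL template for "Which films did X direct?".

    `director_local_name` is the local part of a DBpedia resource IRI,
    e.g. "Stanley_Kubrick". No escaping is performed in this sketch.
    """
    return (
        f"PREFIX dbo: <{DBO}>\n"
        f"PREFIX dbr: <{DBR}>\n"
        "SELECT ?film WHERE {\n"
        f"  ?film dbo:director dbr:{director_local_name} .\n"
        "}"
    )


query = build_director_query("Stanley_Kubrick")
```

An answer type predictor would tell such a system in advance that `?film` is expected to be bound to films, which can help rank candidate queries.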
SMART Dataset
• A dataset for answer type prediction task using DBpedia and Wikidata ontologies.
• Derived using KBQA datasets.
• Three main types of questions.
Boolean Questions
Question: Is Azerbaijan a member of the European Go Federation?
Category: boolean
Question: Is Darth Vader Luke’s father?
Category: boolean
Literal Questions
Question: How many people live in Poland?
Category: literal
Type: number
Question: When did Shakespeare die?
Category: literal
Type: date
Question: What is the birth name of Angela Merkel?
Category: literal
Type: string
Resource Questions
Question: Who is the heaviest player of the Chicago Bulls?
Category: resource
Type: dbo:BasketballPlayer, dbo:Athlete, dbo:Person
Question: Give me video games published by EA?
Category: resource
Type: dbo:VideoGame, dbo:Software, dbo:Work
Question: Who wrote the song Hotel California?
Category: resource
Type: dbo:MusicalArtist, dbo:Artist, dbo:Person
Question: Where did John McCarthy get his PhD from?
Category: resource
Type: dbo:University, dbo:EducationalInstitution, dbo:Organization
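The three question kinds above can be captured in a small record type. The following is a minimal sketch, assuming a record with a question, a category, and a list of types; the class name `TypedQuestion` and the validation rules are illustrative assumptions, not the official dataset schema.

```python
from dataclasses import dataclass, field
from typing import List

CATEGORIES = {"boolean", "literal", "resource"}
LITERAL_TYPES = {"number", "date", "string"}


@dataclass
class TypedQuestion:
    """One question with its answer category and answer types (assumed layout)."""
    question: str
    category: str
    types: List[str] = field(default_factory=list)

    def validate(self) -> None:
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        if self.category == "literal":
            # Literal answers carry exactly one of number, date, string.
            if len(self.types) != 1 or self.types[0] not in LITERAL_TYPES:
                raise ValueError("literal questions need one of number/date/string")
        if self.category == "resource" and not self.types:
            # Resource answers carry one or more ontology classes.
            raise ValueError("resource questions need at least one ontology class")


examples = [
    TypedQuestion("Is Darth Vader Luke's father?", "boolean"),
    TypedQuestion("When did Shakespeare die?", "literal", ["date"]),
    TypedQuestion(
        "Who wrote the song Hotel California?",
        "resource",
        ["dbo:MusicalArtist", "dbo:Artist", "dbo:Person"],
    ),
]
for example in examples:
    example.validate()
```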
SMART Dataset - II
Dataset                  Questions
Training Set             17,571
  Resource answers        9,584
  Literal answers         5,188
  Boolean                 2,799
Test Set                  4,393
Total                    21,964

Dataset                  Questions
Training Set             19,670
  Resource answers       11,683
  Literal answers         5,188
  Boolean                 2,799
Test Set                  4,571
Total                    24,241
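As a quick consistency check on the figures above, each training-set size should equal the sum of its three categories, and each total should equal training plus test. The sketch below just redoes that arithmetic; the keys "first" and "second" are placeholders for the two tables.

```python
# Question counts copied from the two tables above.
splits = {
    "first": {"resource": 9584, "literal": 5188, "boolean": 2799, "test": 4393},
    "second": {"resource": 11683, "literal": 5188, "boolean": 2799, "test": 4571},
}


def totals(counts: dict) -> tuple:
    """Return (training-set size, overall total) for one table."""
    train = counts["resource"] + counts["literal"] + counts["boolean"]
    return train, train + counts["test"]


first_train, first_total = totals(splits["first"])
second_train, second_total = totals(splits["second"])
```

Both tables are internally consistent: the sums reproduce the stated training-set sizes (17,571 and 19,670) and totals (21,964 and 24,241).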
Evaluation
• Systems can participate in either one or both datasets; each will have a separate leaderboard.
• Systems can be rule-based, unsupervised, supervised …
• For each test question, the systems should provide
• Category and a list of types
• Evaluation metric
• Lenient NDCG@5 and NDCG@10 with linear decay (as defined by Balog and Neumayer)
Balog, Krisztian, and Robert Neumayer. "Hierarchical target type identification for entity-oriented
queries." (ACM CIKM'12).
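\[
\mathrm{DCG}(\mathit{type\_list}) = \sum_{i=1}^{k} \frac{\mathrm{relevance}(\mathit{type}_{pred,i})}{\log_2(i + 1)}
\]
\[
\mathrm{relevance}(\mathit{type}_{pred}) = 1 - \frac{\mathrm{distance}(\mathit{type}_{pred}, \mathit{type}_{gold})}{\mathit{max\_depth}}
\]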
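To make the metric concrete, here is a minimal sketch of NDCG@k with linear decay over a toy type hierarchy. Several choices are assumptions for illustration only and may differ from the official evaluation: the hierarchy is a hand-written fragment, distance is the path length through the closest common ancestor, a single most specific gold type is used, relevance is clipped at 0, and the ideal ranking is taken over all types in the toy hierarchy. Ranks are counted from 1, so the first discount is log2(2) = 1.

```python
import math
from typing import Dict, List, Optional

# Toy hierarchy (child -> parent); an illustrative fragment, not the full ontology.
PARENT: Dict[str, Optional[str]] = {
    "owl:Thing": None,
    "dbo:Person": "owl:Thing",
    "dbo:Athlete": "dbo:Person",
    "dbo:BasketballPlayer": "dbo:Athlete",
    "dbo:Artist": "dbo:Person",
    "dbo:MusicalArtist": "dbo:Artist",
}


def path_to_root(t: str) -> List[str]:
    path, node = [], t
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


MAX_DEPTH = max(len(path_to_root(t)) - 1 for t in PARENT)


def distance(a: str, b: str) -> int:
    """Number of edges between a and b via their closest common ancestor."""
    steps_from_b = {node: i for i, node in enumerate(path_to_root(b))}
    for steps_from_a, node in enumerate(path_to_root(a)):
        if node in steps_from_b:
            return steps_from_a + steps_from_b[node]
    raise ValueError("types share no common ancestor")


def relevance(pred: str, gold: str) -> float:
    # Linear decay: 1 at the gold type, falling with distance, clipped at 0.
    return max(0.0, 1.0 - distance(pred, gold) / MAX_DEPTH)


def dcg(gains: List[float], k: int) -> float:
    return sum(g / math.log2(i + 1) for i, g in enumerate(gains[:k], start=1))


def ndcg(predicted: List[str], gold: str, k: int = 5) -> float:
    predicted = list(dict.fromkeys(predicted))  # drop repeated predictions
    gains = [relevance(p, gold) for p in predicted]
    ideal = sorted((relevance(t, gold) for t in PARENT), reverse=True)
    ideal_dcg = dcg(ideal, k)
    return dcg(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0
```

With the gold type dbo:BasketballPlayer, the ranked list dbo:BasketballPlayer, dbo:Athlete, dbo:Person matches the ideal ranking in this toy setting, the same list in reverse order scores lower, and a list containing only dbo:MusicalArtist scores 0 because its relevance is clipped.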
Timeline
Date                        Description
6th of May, 2020            Release of training sets.
10th of August, 2020        Release of the test sets.
17th of August, 2020        Submission of system output and system description.
31st of August, 2020        Publication of results and notification of acceptance for presentation.
14th of September, 2020     Camera-ready submission.
2nd-6th of November, 2020   ISWC Challenge (virtual) at the ISWC Conference.
Please visit https://smart-task.github.io/ for more details.
Thank You!
• We are looking forward to your participation.
• Please report any issues related to the dataset at
• https://github.com/smart-task/smart-dataset/issues
• Please feel free to contact us with any questions/feedback:
• Nandana Mihindukulasooriya <nandana.m@ibm.com>
• Mohnish Dubey <dubey@cs.uni-bonn.de>

ISWC 2020 - Semantic Answer Type Prediction
