UvA-DARE (Digital Academic Repository)

Author
O. Inel
G. Haralabopoulos
D. Li
C. Van Gysel
Z. Szlávik
E. Simperl
E. Kanoulas
L. Aroyo
Year
2018
Title
Studying Topical Relevance with Evidence-based Crowdsourcing
Event
27th ACM International Conference on Information and Knowledge Management
Book/source title
CIKM '18
Book/source subtitle
Proceedings of the 2018 ACM International Conference on Information and Knowledge Management: October 22-26, 2018, Torino, Italy
Pages (from-to)
1253-1262
Number of pages
10
Publisher
New York, NY: The Association for Computing Machinery
ISBN (electronic)
9781450360142
Document type
Conference contribution
Faculty
Faculty of Science (FNWI)
Faculty of Economics and Business (FEB)
Institute
Informatics Institute (IVI)
Amsterdam Business School Research Institute (ABS-RI)
Abstract
Information Retrieval systems rely on large test collections to measure their effectiveness in retrieving relevant documents. While the demand is high, the task of creating such test collections is laborious, due both to the large amounts of data that need to be annotated and to the intrinsic subjectivity of the task itself. In this paper we study topical relevance from a user perspective by addressing the problems of subjectivity and ambiguity. We compare our approach and results with the established TREC annotation guidelines and results. The comparison is based on a series of crowdsourcing pilots experimenting with variables such as relevance scale, document granularity, annotation template, and the number of workers. Our results show a correlation between relevance assessment accuracy and smaller document granularity: aggregating relevance judgments at the paragraph level yields better relevance accuracy than assessment at the level of the full document. As expected, our results also show that collecting binary relevance judgments results in higher accuracy than the ternary scale used in the TREC annotation guidelines. Finally, the crowdsourced annotation tasks provided a more accurate document relevance ranking than a single assessor's relevance label. This work resulted in a reliable test collection around the TREC Common Core track.
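
The abstract's point about paragraph-level aggregation can be illustrated with a minimal sketch (Python). The data, worker identifiers, and the simple vote-fraction/max aggregation below are hypothetical assumptions for illustration only; the paper's actual aggregation of crowdsourced judgments may differ.

    # Illustrative sketch: aggregate crowdsourced paragraph-level binary relevance
    # judgments into a document-level relevance score. Hypothetical data and a
    # simple vote-fraction/max aggregation, not necessarily the paper's method.
    from collections import defaultdict

    # (document_id, paragraph_id, worker_id, relevant 0/1) -- hypothetical judgments
    judgments = [
        ("doc1", 0, "w1", 1), ("doc1", 0, "w2", 1), ("doc1", 0, "w3", 0),
        ("doc1", 1, "w1", 0), ("doc1", 1, "w2", 0), ("doc1", 1, "w3", 0),
        ("doc2", 0, "w1", 0), ("doc2", 0, "w2", 1), ("doc2", 0, "w3", 0),
    ]

    def paragraph_scores(judgments):
        """Fraction of workers who judged each (document, paragraph) relevant."""
        votes = defaultdict(list)
        for doc, para, _worker, label in judgments:
            votes[(doc, para)].append(label)
        return {key: sum(labels) / len(labels) for key, labels in votes.items()}

    def document_scores(judgments):
        """Document relevance taken as the maximum of its paragraph-level scores."""
        paras = paragraph_scores(judgments)
        docs = defaultdict(float)
        for (doc, _para), score in paras.items():
            docs[doc] = max(docs[doc], score)
        return dict(docs)

    if __name__ == "__main__":
        print(document_scores(judgments))  # -> {'doc1': 0.666..., 'doc2': 0.333...}

In this toy setup, collecting judgments per paragraph and then aggregating up to the document gives a graded document score from binary worker labels, which is the kind of paragraph-level aggregation the abstract reports as more accurate than whole-document assessment.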
Language
English
Persistent Identifier
https://hdl.handle.net/11245.1/17c858fe-aa78-461a-a51a-f66c7520e494
Downloads
  • p1253-inel (final published version, PDF)