Is Your Toxicity My Toxicity? Exploring the Impact of Rater Identity on Toxicity Annotation

Goyal, Nitesh; Kivlichan, Ian; Rosen, Rachel; Vasserman, Lucy

arXiv:2205.00501 (cs)

[Submitted on 1 May 2022]

Abstract: Machine learning models are commonly used to detect toxicity in online conversations. These models are trained on datasets annotated by human raters. We explore how raters' self-described identities affect how they annotate toxicity in online comments. We first define the concept of specialized rater pools: rater pools formed based on raters' self-described identities, rather than at random. We formed three such rater pools for this study: raters from the U.S. who identify as African American, raters who identify as LGBTQ, and raters who identify as neither. Each of these rater pools annotated the same set of comments, which contains many references to these identity groups. We found that rater identity is a statistically significant factor in how raters annotate toxicity for identity-related annotations. Using preliminary content analysis, we examined the comments with the most disagreement between rater pools and found nuanced differences in the toxicity annotations. Next, we trained models on the annotations from each of the rater pools and compared the scores of these models on comments from several test sets. Finally, we discuss how using raters who self-identify with the subjects of comments can create more inclusive machine learning models and provide more nuanced ratings than those from random raters.

Comments:	Proceedings of the ACM on Human-Computer Interaction, ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW 2022)
Subjects:	Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2205.00501 [cs.HC]
	(or arXiv:2205.00501v1 [cs.HC] for this version)
	https://doi.org/10.48550/arXiv.2205.00501

Submission history

From: Nitesh Goyal
[v1] Sun, 1 May 2022 16:08:48 UTC (3,580 KB)
