Targeted Identity Group Prediction in Hate Speech Corpora

Pratik Sachdeva, Renata Barreto, Claudia Von Vacano, Chris Kennedy


Abstract
The past decade has seen an abundance of work seeking to detect, characterize, and measure online hate speech. A related but less studied problem is the detection of the identity groups targeted by that hate speech. Accurate prediction on this task can support analyses beyond hate speech detection, which motivates its study. Using the Measuring Hate Speech corpus, which provides annotations for targeted identity groups, we created neural network models to perform multi-label binary prediction of the identity groups targeted by a comment. Specifically, we studied 8 broad identity groups and 12 identity sub-groups within race and gender identity. We found that these networks exhibited good predictive performance, achieving ROC AUCs greater than 0.9 and PR AUCs greater than 0.7 on several identity groups. We validated their performance on HateCheck and Gab Hate Corpora, finding that predictive performance generalized in most settings. We additionally examined the performance of the model on comments targeting multiple identity groups. Our results demonstrate the feasibility of simultaneously identifying targeted groups in social media comments.
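As a rough illustration of what per-group evaluation in a multi-label setting can look like, the sketch below computes a ROC AUC and an average-precision estimate of the PR AUC separately for each identity group. It is not the authors' implementation: the function names, the two group names, and the toy labels and scores are all hypothetical and chosen only for illustration, and average precision is used here as one common summary of the PR curve.

```python
from typing import Dict, List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: mean of precision values at each positive's rank.

    Simplification: tied scores are ordered arbitrarily by the sort.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_pos


def per_group_metrics(
    y_true: List[List[int]],
    y_score: List[List[float]],
    groups: List[str],
) -> Dict[str, Dict[str, float]]:
    """Treat each group (column) as its own binary task and score it."""
    results = {}
    for j, group in enumerate(groups):
        labels = [row[j] for row in y_true]
        scores = [row[j] for row in y_score]
        results[group] = {
            "roc_auc": roc_auc(labels, scores),
            "pr_auc": average_precision(labels, scores),
        }
    return results


# Toy data (made up, not from the paper): 4 comments, 2 hypothetical groups.
groups = ["race", "gender"]
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]  # a comment may target several groups
y_score = [[0.9, 0.2], [0.3, 0.8], [0.6, 0.4], [0.1, 0.5]]

metrics = per_group_metrics(y_true, y_score, groups)
for group, values in metrics.items():
    print(group, values)
```

Scoring each group as an independent binary task is what allows a single comment to count as a positive example for more than one group, which matches the multi-label framing described in the abstract.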
Anthology ID:
2022.woah-1.22
Volume:
Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)
Month:
July
Year:
2022
Address:
Seattle, Washington (Hybrid)
Editors:
Kanika Narang, Aida Mostafazadeh Davani, Lambert Mathias, Bertie Vidgen, Zeerak Talat
Venue:
WOAH
Publisher:
Association for Computational Linguistics
Pages:
231–244
URL:
https://aclanthology.org/2022.woah-1.22
DOI:
10.18653/v1/2022.woah-1.22
Cite (ACL):
Pratik Sachdeva, Renata Barreto, Claudia Von Vacano, and Chris Kennedy. 2022. Targeted Identity Group Prediction in Hate Speech Corpora. In Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH), pages 231–244, Seattle, Washington (Hybrid). Association for Computational Linguistics.
Cite (Informal):
Targeted Identity Group Prediction in Hate Speech Corpora (Sachdeva et al., WOAH 2022)
PDF:
https://aclanthology.org/2022.woah-1.22.pdf
Code:
dlab-projects/hate_target