Analyzing Hate Speech Data along Racial, Gender and Intersectional Axes

Antonis Maronikolakis, Philip Baader, Hinrich Schütze


Abstract
To tackle the rising phenomenon of hate speech, efforts have been made towards data curation and analysis. When it comes to analysis of bias, previous work has focused predominantly on race. In our work, we further investigate bias in hate speech datasets along racial, gender and intersectional axes. We identify strong bias against African American English (AAE), masculine and AAE+Masculine tweets, which are annotated as disproportionately more hateful and offensive than tweets from other demographics. We provide evidence that BERT-based models propagate this bias and show that balancing the training data for these protected attributes can lead to fairer models with regard to gender, but not race.
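The data-balancing step mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration of downsampling each demographic group to the size of the smallest group, not the authors' exact procedure; the function name and attribute labels are hypothetical.

```python
import random
from collections import defaultdict

def balance_by_attribute(examples, seed=0):
    """Downsample each protected-attribute group to the smallest group's size.

    `examples` is a list of (text, label, attribute) triples, where
    `attribute` marks the demographic group (labels here are illustrative).
    """
    groups = defaultdict(list)
    for ex in examples:
        groups[ex[2]].append(ex)
    n = min(len(g) for g in groups.values())  # size of the smallest group
    rng = random.Random(seed)
    balanced = []
    for g in groups.values():
        balanced.extend(rng.sample(g, n))  # keep n examples per group
    rng.shuffle(balanced)
    return balanced

# Toy data: 6 AAE-aligned tweets vs. 2 from another group.
data = (
    [("tweet %d" % i, "offensive", "AAE") for i in range(6)]
    + [("tweet %d" % i, "neutral", "White-aligned") for i in range(2)]
)
balanced = balance_by_attribute(data)
print(len(balanced))  # 4: two examples per group
```

A classifier trained on the balanced set sees the same number of examples per group, which is one common way to probe whether annotation bias is propagated by the model.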
Anthology ID: 2022.gebnlp-1.1
Volume: Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP)
Month: July
Year: 2022
Address: Seattle, Washington
Venue: GeBNLP
Publisher: Association for Computational Linguistics
Pages: 1–7
URL: https://aclanthology.org/2022.gebnlp-1.1
DOI: 10.18653/v1/2022.gebnlp-1.1
Cite (ACL): Antonis Maronikolakis, Philip Baader, and Hinrich Schütze. 2022. Analyzing Hate Speech Data along Racial, Gender and Intersectional Axes. In Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP), pages 1–7, Seattle, Washington. Association for Computational Linguistics.
Cite (Informal): Analyzing Hate Speech Data along Racial, Gender and Intersectional Axes (Maronikolakis et al., GeBNLP 2022)
PDF: https://aclanthology.org/2022.gebnlp-1.1.pdf