Towards Weakly-Supervised Hate Speech Classification Across Datasets

Yiping Jin, Leo Wanner, Vishakha Kadam, Alexander Shvets


Abstract
As pointed out by several scholars, current research on hate speech (HS) recognition is characterized by unsystematic data creation strategies and diverging annotation schemata. Consequently, supervised-learning models tend to generalize poorly to datasets they were not trained on, and the performance of models trained on datasets labeled using different HS taxonomies cannot be compared. To ease this problem, we propose applying extremely weak supervision that relies only on the class name rather than on class samples from the annotated data. We demonstrate the effectiveness of a state-of-the-art weakly-supervised text classification model in various in-dataset and cross-dataset settings. Furthermore, we conduct an in-depth quantitative and qualitative analysis of the source of poor generalizability of HS classification models.
Anthology ID:
2023.woah-1.4
Volume:
The 7th Workshop on Online Abuse and Harms (WOAH)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Yi-ling Chung, Paul Röttger, Debora Nozza, Zeerak Talat, Aida Mostafazadeh Davani
Venue:
WOAH
Publisher:
Association for Computational Linguistics
Pages:
42–59
URL:
https://aclanthology.org/2023.woah-1.4
DOI:
10.18653/v1/2023.woah-1.4
Cite (ACL):
Yiping Jin, Leo Wanner, Vishakha Kadam, and Alexander Shvets. 2023. Towards Weakly-Supervised Hate Speech Classification Across Datasets. In The 7th Workshop on Online Abuse and Harms (WOAH), pages 42–59, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Towards Weakly-Supervised Hate Speech Classification Across Datasets (Jin et al., WOAH 2023)
PDF:
https://aclanthology.org/2023.woah-1.4.pdf