Label Propagation-Based Semi-Supervised Learning for Hate Speech Classification

Ashwin Geet D’Sa, Irina Illina, Dominique Fohr, Dietrich Klakow, Dana Ruiter


Abstract
Research on hate speech classification has received increased attention. In real-life scenarios, however, only a small amount of labeled hate speech data is typically available for training a reliable classifier. Semi-supervised learning takes advantage of a small amount of labeled data together with a large amount of unlabeled data. In this paper, label propagation-based semi-supervised learning is explored for the task of hate speech classification. The quality of the labels assigned to the unlabeled set depends on the input representations. In this work, we show that pre-trained representations are label-agnostic and, when used with label propagation, yield poor results. Neural network-based fine-tuning can be adopted to learn task-specific representations using a small amount of labeled data. We show that fully fine-tuned representations may not always be the best representations for label propagation, and that intermediate representations may perform better in a semi-supervised setup.
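As a rough illustration of the graph-based semi-supervised setup the abstract describes (not the paper's exact pipeline), label propagation can be run over fixed feature vectors with unlabeled examples marked `-1`, e.g. via scikit-learn's `LabelPropagation`. The synthetic Gaussian embeddings below are stand-ins for the pre-trained or fine-tuned sentence representations the paper studies.

```python
import numpy as np
from sklearn.semi_supervised import LabelPropagation

rng = np.random.default_rng(0)

# Toy stand-in for sentence representations: two well-separated
# Gaussian clusters (hypothetical substitute for real text embeddings).
X_hate = rng.normal(loc=1.0, scale=0.3, size=(100, 16))
X_normal = rng.normal(loc=-1.0, scale=0.3, size=(100, 16))
X = np.vstack([X_hate, X_normal])
y = np.array([1] * 100 + [0] * 100)

# Keep only a handful of labels; mark the rest as unlabeled (-1),
# which is scikit-learn's convention for semi-supervised targets.
y_semi = np.full_like(y, -1)
labeled_idx = rng.choice(len(y), size=10, replace=False)
y_semi[labeled_idx] = y[labeled_idx]

# Propagate the few known labels through an RBF similarity graph
# built over the input representations.
model = LabelPropagation(kernel="rbf", gamma=0.5)
model.fit(X, y_semi)

# transduction_ holds the inferred labels for every point.
accuracy = (model.transduction_ == y).mean()
print(f"transductive accuracy: {accuracy:.2f}")
```

Because propagation spreads labels along the similarity graph, its quality hinges on how well the representation separates the classes, which is exactly the sensitivity the abstract reports for pre-trained versus fine-tuned embeddings.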
Anthology ID:
2020.insights-1.8
Volume:
Proceedings of the First Workshop on Insights from Negative Results in NLP
Month:
November
Year:
2020
Address:
Online
Venue:
insights
Publisher:
Association for Computational Linguistics
Pages:
54–59
URL:
https://aclanthology.org/2020.insights-1.8
DOI:
10.18653/v1/2020.insights-1.8
Cite (ACL):
Ashwin Geet D’Sa, Irina Illina, Dominique Fohr, Dietrich Klakow, and Dana Ruiter. 2020. Label Propagation-Based Semi-Supervised Learning for Hate Speech Classification. In Proceedings of the First Workshop on Insights from Negative Results in NLP, pages 54–59, Online. Association for Computational Linguistics.
Cite (Informal):
Label Propagation-Based Semi-Supervised Learning for Hate Speech Classification (D’Sa et al., insights 2020)
PDF:
https://aclanthology.org/2020.insights-1.8.pdf
Video:
https://slideslive.com/38940795