TheNorth at SemEval-2020 Task 12: Hate Speech Detection Using RoBERTa

Pedro Alonso, Rajkumar Saini, György Kovacs


Abstract
Hate speech detection on social media platforms is crucial, as it helps to avoid severe situations and severe harm to marginalized people and groups. The application of Natural Language Processing (NLP) and Deep Learning has produced encouraging results in the task of hate speech detection. The expression of hate, however, is varied and ever evolving, so it is important for detection systems to adapt to this variance. For this reason, researchers keep collecting data and regularly organize hate speech detection competitions. In this paper, we discuss our entry to one such competition, namely the English version of sub-task A of the OffensEval competition. Our contribution can be seen in our results: an initial F1-score of 0.9089, which rose to 0.9166 with the further refinements described here. These results lend further support to our hypothesis that one of the variants of BERT (Devlin et al., 2018), namely RoBERTa, can successfully differentiate between offensive and not-offensive tweets, given some preprocessing steps (also outlined here).
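
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an F1-score for a two-class offensive/not-offensive task can be computed. It assumes the score is macro-averaged over the two classes, which the abstract does not state; the label strings `OFF` and `NOT`, the function names, and the toy data are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch: macro-averaged F1 for a two-class tweet classification task.
# Assumptions (not from the paper): labels are the strings "OFF" and "NOT",
# and the reported score is the unweighted mean of the per-class F1 values.

def f1_for_class(gold, pred, label):
    """F1 of a single class, treating `label` as the positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred, labels=("OFF", "NOT")):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(gold, pred, label) for label in labels) / len(labels)


# Toy example (made-up labels, not data from the competition).
gold = ["OFF", "NOT", "NOT", "OFF", "NOT", "NOT"]
pred = ["OFF", "NOT", "OFF", "OFF", "NOT", "NOT"]
score = macro_f1(gold, pred)
print(f"macro F1 = {score:.4f}")
```

In this toy example one not-offensive tweet is misclassified as offensive, which lowers the precision of the `OFF` class and the recall of the `NOT` class; the macro average weights both classes equally regardless of how many tweets each contains.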
Anthology ID:
2020.semeval-1.292
Volume:
Proceedings of the Fourteenth Workshop on Semantic Evaluation
Month:
December
Year:
2020
Address:
Barcelona (online)
Venues:
COLING | SemEval
SIGs:
SIGSEM | SIGLEX
Publisher:
International Committee for Computational Linguistics
Pages:
2197–2202
URL:
https://aclanthology.org/2020.semeval-1.292
PDF:
https://aclanthology.org/2020.semeval-1.292.pdf