TheNorth at SemEval-2020 Task 12: Hate Speech Detection Using RoBERTa

Pedro Alonso; Rajkumar Saini; György Kovács

doi:10.18653/v1/2020.semeval-1.292

Abstract

Hate speech detection on social media platforms is crucial, as it helps to prevent severe situations and severe harm to marginalized people and groups. The application of Natural Language Processing (NLP) and Deep Learning has garnered encouraging results in the task of hate speech detection. The expression of hate, however, is varied and ever-evolving. Thus, it is important for better detection systems to adapt to this variance. Because of this, researchers keep collecting data and regularly organize hate speech detection competitions. In this paper, we discuss our entry to one such competition, namely the English version of sub-task A of the OffensEval competition. Our contribution can be seen in our results: an F1-score of 0.9089 at first, which climbed to 0.9166 with the further refinements described here. This lends further support to our hypothesis that one of the variants of BERT (Devlin et al., 2018), namely RoBERTa, can successfully differentiate between offensive and not-offensive tweets, given some preprocessing steps (also outlined here).
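
For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F1-score over the two classes could be computed. It is not the authors' evaluation code. Macro-averaging over the classes is an assumption (the abstract does not state the averaging), the label names `OFF` and `NOT` are assumed for illustration, and the gold and predicted labels are made-up toy data.

```python
from typing import Sequence


def f1_for_class(gold: Sequence[str], pred: Sequence[str], label: str) -> float:
    """F1 for one class, treating `label` as the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        # No true positives: precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold: Sequence[str], pred: Sequence[str],
             labels: Sequence[str] = ("OFF", "NOT")) -> float:
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    return sum(f1_for_class(gold, pred, lab) for lab in labels) / len(labels)


# Toy data, invented for illustration only.
gold = ["OFF", "NOT", "NOT", "OFF", "NOT", "NOT"]
pred = ["OFF", "NOT", "OFF", "OFF", "NOT", "NOT"]

print(f"macro F1 = {macro_f1(gold, pred):.4f}")
```

With macro-averaging, both classes contribute equally to the score regardless of how many tweets each contains, which matters when one class is much rarer than the other.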

Anthology ID: 2020.semeval-1.292
Volume: Proceedings of the Fourteenth Workshop on Semantic Evaluation
Month: December
Year: 2020
Address: Barcelona (online)
Editors: Aurelie Herbelot, Xiaodan Zhu, Alexis Palmer, Nathan Schneider, Jonathan May, Ekaterina Shutova
Venue: SemEval
SIG: SIGLEX
Publisher: International Committee for Computational Linguistics
Pages: 2197–2202
URL: https://aclanthology.org/2020.semeval-1.292/
DOI: 10.18653/v1/2020.semeval-1.292
Cite (ACL): Pedro Alonso, Rajkumar Saini, and György Kovacs. 2020. TheNorth at SemEval-2020 Task 12: Hate Speech Detection Using RoBERTa. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 2197–2202, Barcelona (online). International Committee for Computational Linguistics.
Cite (Informal): TheNorth at SemEval-2020 Task 12: Hate Speech Detection Using RoBERTa (Alonso et al., SemEval 2020)
PDF: https://aclanthology.org/2020.semeval-1.292.pdf
