Multitask Learning for Arabic Offensive Language and Hate-Speech Detection

Ibrahim Abu Farha, Walid Magdy


Abstract
Offensive language and hate-speech have spread with the rising popularity of social media. Detecting such content is crucial for understanding and predicting conflicts, understanding polarisation among communities, and providing the means and tools to filter or block inappropriate content. This paper describes the SMASH team submission to OSACT4's shared task on hate-speech and offensive language detection, in which we explore several approaches to these two tasks. The experiments cover deep learning, transfer learning and multitask learning. We also explore the use of sentiment information to support both detection tasks. Our best model is a multitask learning architecture, based on CNN-BiLSTM, that was trained to detect hate-speech and offensive language and to predict sentiment.
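As an illustration of the multitask idea mentioned in the abstract, the sketch below combines the losses of three classification heads (offensive language, hate-speech, sentiment) into one training objective. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the label sets, the equal task weights and the example logits are all hypothetical, and the shared CNN-BiLSTM encoder that would produce the logits is omitted.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-likelihood of the gold class index under the softmax."""
    return -math.log(softmax(logits)[target])


def multitask_loss(head_logits, targets, weights=None):
    """Weighted sum of per-task cross-entropy losses.

    head_logits and targets are dicts keyed by task name. The weights are an
    assumption of this sketch; by default every task contributes equally.
    """
    weights = weights or {task: 1.0 for task in head_logits}
    return sum(
        weights[task] * cross_entropy(head_logits[task], targets[task])
        for task in head_logits
    )


# Hypothetical outputs of three task heads for a single tweet.
logits = {
    "offensive": [0.2, 1.5],       # assumed classes: not offensive, offensive
    "hate_speech": [1.1, -0.3],    # assumed classes: not hate-speech, hate-speech
    "sentiment": [0.1, 0.4, 2.0],  # assumed classes: three sentiment labels
}
gold = {"offensive": 1, "hate_speech": 0, "sentiment": 2}

loss = multitask_loss(logits, gold)
print(f"combined loss: {loss:.4f}")
```

Because the three heads would share one encoder, minimising the summed loss lets the sentiment signal shape the representation used by the two detection heads; the relative task weights are a design choice that this sketch leaves at 1.0.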
Anthology ID: 2020.osact-1.14
Volume: Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection
Month: May
Year: 2020
Address: Marseille, France
Venues: LREC | OSACT | WS
Publisher: European Language Resource Association
Pages: 86–90
Language: English
URL: https://aclanthology.org/2020.osact-1.14
Cite (ACL): Ibrahim Abu Farha and Walid Magdy. 2020. Multitask Learning for Arabic Offensive Language and Hate-Speech Detection. In Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection, pages 86–90, Marseille, France. European Language Resource Association.
Cite (Informal): Multitask Learning for Arabic Offensive Language and Hate-Speech Detection (Abu Farha & Magdy, OSACT 2020)
PDF: https://aclanthology.org/2020.osact-1.14.pdf