CHILLAX - at Arabic Hate Speech 2022: A Hybrid Machine Learning and Transformers based Model to Detect Arabic Offensive and Hate Speech

Kirollos Makram, Kirollos George Nessim, Malak Emad Abd-Almalak, Shady Zekry Roshdy, Seif Hesham Salem, Fady Fayek Thabet, Ensaf Hussien Mohamed


Abstract
Hate speech and offensive language have become a pressing problem because social media is used extensively by people of different genders, nationalities, religions, and other characteristics, and lets anyone share their thoughts and opinions. In this paper, we propose a hybrid model for the first and second tasks of OSACT2022. The model uses MARBERT, a pre-trained Arabic BERT language model, to extract features from the Arabic tweets in the dataset provided by the OSACT2022 shared task, then feeds those features to two classic machine learning classifiers (Logistic Regression and Random Forest). For the offensive tweet detection task, the best results came from the Logistic Regression model, with accuracy, precision, recall, and F1-score of 80%, 78%, 78%, and 78%, respectively. For the hate speech tweet detection task, the results were 89%, 72%, 80%, and 76%.
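As a reading aid for the four reported figures, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be computed from gold and predicted labels. The abstract does not say how the per-class scores were averaged, so macro averaging is an assumption here; the function name `macro_report` and the toy labels are hypothetical and are not taken from the paper or the shared-task data.

```python
from collections import Counter


def macro_report(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Assumption: macro averaging (unweighted mean over classes). The
    abstract does not state which averaging the authors used.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1      # correct prediction for this class
        else:
            fp[pred] += 1      # predicted class was wrong
            fn[gold] += 1      # gold class was missed

    def safe_div(num, den):
        # Report 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p = safe_div(tp[label], tp[label] + fp[label])
        r = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(p)
        recalls.append(r)
        f1s.append(safe_div(2 * p * r, p + r))

    n = len(labels)
    return {
        "accuracy": safe_div(sum(tp.values()), len(y_true)),
        "precision": safe_div(sum(precisions), n),
        "recall": safe_div(sum(recalls), n),
        "f1": safe_div(sum(f1s), n),
    }


# Toy labels only (1 = offensive, 0 = not offensive); not real data.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 1, 0, 0, 0, 0]
toy_scores = macro_report(toy_true, toy_pred)
```

With macro averaging, each class contributes equally to precision, recall, and F1 regardless of how many tweets it contains. On an imbalanced task this can leave accuracy well above the other three scores.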
Anthology ID:
2022.osact-1.25
Volume:
Proceedings of the 5th Workshop on Open-Source Arabic Corpora and Processing Tools with Shared Tasks on Qur'an QA and Fine-Grained Hate Speech Detection
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Hend Al-Khalifa, Tamer Elsayed, Hamdy Mubarak, Abdulmohsen Al-Thubaity, Walid Magdy, Kareem Darwish
Venue:
OSACT
Publisher:
European Language Resources Association
Pages:
194–199
URL:
https://aclanthology.org/2022.osact-1.25
Cite (ACL):
Kirollos Makram, Kirollos George Nessim, Malak Emad Abd-Almalak, Shady Zekry Roshdy, Seif Hesham Salem, Fady Fayek Thabet, and Ensaf Hussien Mohamed. 2022. CHILLAX - at Arabic Hate Speech 2022: A Hybrid Machine Learning and Transformers based Model to Detect Arabic Offensive and Hate Speech. In Proceedings of the 5th Workshop on Open-Source Arabic Corpora and Processing Tools with Shared Tasks on Qur'an QA and Fine-Grained Hate Speech Detection, pages 194–199, Marseille, France. European Language Resources Association.
Cite (Informal):
CHILLAX - at Arabic Hate Speech 2022: A Hybrid Machine Learning and Transformers based Model to Detect Arabic Offensive and Hate Speech (Makram et al., OSACT 2022)
PDF:
https://aclanthology.org/2022.osact-1.25.pdf