Zoher Orabe
2020
Arabic Offensive Language Detection with Attention-based Deep Neural Networks
Bushr Haddad | Zoher Orabe | Anas Al-Abood | Nada Ghneim
Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection
In this paper, we tackle the problem of offensive language and hate speech detection. We propose methods for data preprocessing and balancing, and then present the Convolutional Neural Network (CNN) and bidirectional Gated Recurrent Unit (GRU) models we used. We then augment these models with an attention layer. The best results were achieved with the bidirectional Gated Recurrent Unit augmented with an attention layer (Bi-GRU_ATT).
Keywords: Abusive Language, Text Mining, Arabic Language, Social Media Mining, Deep Learning, Convolutional Neural Network, Gated Recurrent Unit, Attention Mechanism, Machine Learning.
DoTheMath at SemEval-2020 Task 12: Deep Neural Networks with Self Attention for Arabic Offensive Language Detection
Zoher Orabe | Bushr Haddad | Nada Ghneim | Anas Al-Abood
Proceedings of the Fourteenth Workshop on Semantic Evaluation
This paper describes our team's work and submission for SemEval 2020 (Sub-Task A), “Offensive Eval: Identifying and Categorizing Offensive Arabic Language in Arabic Social Media”. Our two baseline models were based on different levels of representation: character level and word level. For the word-level representation, we implemented a convolutional neural network model and a bi-directional GRU model. For the character-level representation, we implemented a hyper CNN and LSTM model. All of these models were further augmented with attention layers for better performance on our task. We also experimented with three types of static word embeddings (word2vec, FastText, and GloVe), in addition to emoji embeddings, and compared the performance of the different deep learning models on the dataset provided by this task. The bi-directional GRU model with attention achieved the highest score (an F1 score of 0.85) among all models.