Hierarchical Deep Learning for Arabic Dialect Identification

Gael de Francony, Victor Guichard, Praveen Joshi, Haithem Afli, Abdessalam Bouchekif


Abstract
In this paper, we present two approaches to fine-grained Arabic dialect identification. The first is based on recurrent neural networks (BLSTM, BGRU) with hierarchical classification. The main idea is to split the classification of a sentence into two stages: a coarse-grained classification (8 classes), followed by a finer-grained classification (26 classes). The second approach is a voting system based on Naive Bayes and Random Forest. Our system achieves an F1 score of 63.02% on the subtask evaluation dataset.
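The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses BLSTM/BGRU networks, whereas the sketch swaps in a tiny character-bigram Naive Bayes classifier so that it runs with the standard library alone. The class names (`BigramNB`, `HierarchicalClassifier`), the toy strings, and the labels (`region_A`, `city_A1`, ...) are all hypothetical placeholders, not data or label names from the paper.

```python
import math
from collections import Counter, defaultdict


class BigramNB:
    """Toy multinomial Naive Bayes over character bigrams.

    Assumption: a stand-in for the paper's BLSTM/BGRU models,
    chosen only to keep the sketch dependency-free.
    """

    def __init__(self):
        self.feat_counts = defaultdict(Counter)  # label -> bigram counts
        self.doc_counts = Counter()              # label -> number of sentences
        self.vocab = set()

    @staticmethod
    def _features(text):
        padded = f" {text} "
        return [padded[i:i + 2] for i in range(len(padded) - 1)]

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            feats = self._features(text)
            self.feat_counts[label].update(feats)
            self.doc_counts[label] += 1
            self.vocab.update(feats)
        return self

    def predict(self, text):
        feats = self._features(text)
        n_docs = sum(self.doc_counts.values())

        def log_posterior(label):
            total = sum(self.feat_counts[label].values())
            score = math.log(self.doc_counts[label] / n_docs)
            for f in feats:
                # Add-one smoothing so unseen bigrams do not zero the score.
                count = self.feat_counts[label][f] + 1
                score += math.log(count / (total + len(self.vocab)))
            return score

        return max(self.doc_counts, key=log_posterior)


class HierarchicalClassifier:
    """Stage 1 predicts a coarse class; stage 2 picks a fine class within it."""

    def fit(self, texts, coarse_labels, fine_labels):
        self.coarse_clf = BigramNB().fit(texts, coarse_labels)
        # Group training sentences by coarse class, one fine model per group.
        groups = defaultdict(lambda: ([], []))
        for text, coarse, fine in zip(texts, coarse_labels, fine_labels):
            groups[coarse][0].append(text)
            groups[coarse][1].append(fine)
        self.fine_clfs = {
            coarse: BigramNB().fit(ts, fs) for coarse, (ts, fs) in groups.items()
        }
        return self

    def predict(self, text):
        coarse = self.coarse_clf.predict(text)
        return coarse, self.fine_clfs[coarse].predict(text)


# Hypothetical toy data: 2 coarse classes and 4 fine classes
# (the paper's setting is 8 coarse and 26 fine classes).
texts = ["aaa aab", "aab abb", "zzz zzy", "zzy zyy"]
coarse = ["region_A", "region_A", "region_Z", "region_Z"]
fine = ["city_A1", "city_A2", "city_Z1", "city_Z2"]

model = HierarchicalClassifier().fit(texts, coarse, fine)
print(model.predict("aaa"))
print(model.predict("zyy"))
```

One consequence of this design is that a mistake in the first stage cannot be corrected by the second, since each fine-grained model only knows the labels inside its own coarse class.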
Anthology ID:
W19-4631
Volume:
Proceedings of the Fourth Arabic Natural Language Processing Workshop
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Wassim El-Hajj, Lamia Hadrich Belguith, Fethi Bougares, Walid Magdy, Imed Zitouni, Nadi Tomeh, Mahmoud El-Haj, Wajdi Zaghouani
Venue:
WANLP
Publisher:
Association for Computational Linguistics
Pages:
249–253
URL:
https://aclanthology.org/W19-4631
DOI:
10.18653/v1/W19-4631
Cite (ACL):
Gael de Francony, Victor Guichard, Praveen Joshi, Haithem Afli, and Abdessalam Bouchekif. 2019. Hierarchical Deep Learning for Arabic Dialect Identification. In Proceedings of the Fourth Arabic Natural Language Processing Workshop, pages 249–253, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Hierarchical Deep Learning for Arabic Dialect Identification (de Francony et al., WANLP 2019)
PDF:
https://aclanthology.org/W19-4631.pdf