UoT at NADI 2023 shared task: Automatic Arabic Dialect Identification is Made Possible

Abduslam F A Nwesri, Nabila A S Shinbir, Hassan Ebrahem


Abstract
In this paper, we present our approach to Arabic dialect identification as part of the Fourth Nuanced Arabic Dialect Identification Shared Task (NADI 2023). We tested several techniques to identify Arabic dialects. We obtained the best result by fine-tuning the pre-trained MARBERTv2 model on a modified training dataset. The training set was expanded by sorting tweets by dialect, concatenating every two adjacent tweets, and adding the concatenations to the original dataset as new tweets. We achieved an F1 score of 82.87 and ranked seventh among 16 participants.
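The training-set expansion described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `augment_by_concatenation`, the `(text, dialect)` tuple layout, the placeholder tweets, and the dialect labels are all hypothetical. The abstract does not say whether adjacent pairs overlap or how pairs that straddle two dialects are handled, so the sketch assumes overlapping pairs formed only within a single dialect, with each new tweet inheriting that dialect's label.

```python
from itertools import groupby
from operator import itemgetter


def augment_by_concatenation(samples):
    """Return the original samples plus concatenations of adjacent same-dialect tweets.

    `samples` is a list of (text, dialect) tuples (a hypothetical layout).
    """
    # Sort by dialect label so that tweets of the same dialect become adjacent.
    # Python's sort is stable, so the original order is kept within a dialect.
    ordered = sorted(samples, key=itemgetter(1))

    augmented = list(samples)  # keep every original tweet
    for dialect, group in groupby(ordered, key=itemgetter(1)):
        texts = [text for text, _ in group]
        # Assumption: overlapping pairs (i, i + 1), never crossing a dialect boundary.
        for first, second in zip(texts, texts[1:]):
            augmented.append((first + " " + second, dialect))
    return augmented


# Placeholder data; real input would be the labelled training tweets.
tweets = [
    ("tweet a", "Libya"),
    ("tweet b", "Egypt"),
    ("tweet c", "Libya"),
    ("tweet d", "Egypt"),
    ("tweet e", "Libya"),
]
expanded = augment_by_concatenation(tweets)
```

Under these assumptions, a dialect with n tweets contributes n - 1 additional samples, so the expanded set is a little under twice the size of the original. The expanded list would then serve as the training data for fine-tuning.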
Anthology ID:
2023.arabicnlp-1.64
Volume:
Proceedings of ArabicNLP 2023
Month:
December
Year:
2023
Address:
Singapore (Hybrid)
Editors:
Hassan Sawaf, Samhaa El-Beltagy, Wajdi Zaghouani, Walid Magdy, Ahmed Abdelali, Nadi Tomeh, Ibrahim Abu Farha, Nizar Habash, Salam Khalifa, Amr Keleg, Hatem Haddad, Imed Zitouni, Khalil Mrini, Rawan Almatham
Venues:
ArabicNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
620–624
URL:
https://aclanthology.org/2023.arabicnlp-1.64
DOI:
10.18653/v1/2023.arabicnlp-1.64
Cite (ACL):
Abduslam F A Nwesri, Nabila A S Shinbir, and Hassan Ebrahem. 2023. UoT at NADI 2023 shared task: Automatic Arabic Dialect Identification is Made Possible. In Proceedings of ArabicNLP 2023, pages 620–624, Singapore (Hybrid). Association for Computational Linguistics.
Cite (Informal):
UoT at NADI 2023 shared task: Automatic Arabic Dialect Identification is Made Possible (Nwesri et al., ArabicNLP-WS 2023)
PDF:
https://aclanthology.org/2023.arabicnlp-1.64.pdf
Video:
https://aclanthology.org/2023.arabicnlp-1.64.mp4