A Character Level Convolutional BiLSTM for Arabic Dialect Identification

Mohamed Elaraby, Ahmed Zahran


Abstract
In this paper, we describe the CU-RAISA team's contribution to the 2019 Madar shared task 2, which focused on fine-grained dialect identification of Twitter users. Among participating teams, our system ranked 4th, with an F1-macro score of 61.54%. Our system is a character-level convolutional bidirectional long short-term memory network trained on 2k users' data. We show that training on concatenated user tweets as input is superior to training on user tweets separately and assigning each user's label as the mode of the predictions over that user's tweets.
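The two ways of arriving at a user-level label that the abstract compares can be illustrated with a minimal sketch. This is not the authors' code: the function names, the space separator, and the example labels are assumptions made for illustration, and the classifier itself is left out.

```python
from collections import Counter


def concatenate_user_tweets(tweets, sep=" "):
    """Strategy A input: join all of a user's tweets into one string,
    so the model sees one example per user (separator is an assumption)."""
    return sep.join(t.strip() for t in tweets)


def mode_label(tweet_predictions):
    """Strategy B output: the user's label is the most frequent per-tweet
    prediction; ties go to the label encountered first."""
    return Counter(tweet_predictions).most_common(1)[0][0]


# Hypothetical per-tweet predictions for a single user.
per_tweet = ["Egypt", "Iraq", "Egypt"]
user_label = mode_label(per_tweet)

# Hypothetical raw tweets for the same user, merged into one model input.
user_input = concatenate_user_tweets([" tweet one ", "tweet two"])
```

In the first strategy the classifier is trained and queried once per user on the merged text; in the second it is run on every tweet and the votes are aggregated afterwards.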
Anthology ID:
W19-4636
Volume:
Proceedings of the Fourth Arabic Natural Language Processing Workshop
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Wassim El-Hajj, Lamia Hadrich Belguith, Fethi Bougares, Walid Magdy, Imed Zitouni, Nadi Tomeh, Mahmoud El-Haj, Wajdi Zaghouani
Venue:
WANLP
Publisher:
Association for Computational Linguistics
Pages:
274–278
URL:
https://aclanthology.org/W19-4636
DOI:
10.18653/v1/W19-4636
Cite (ACL):
Mohamed Elaraby and Ahmed Zahran. 2019. A Character Level Convolutional BiLSTM for Arabic Dialect Identification. In Proceedings of the Fourth Arabic Natural Language Processing Workshop, pages 274–278, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
A Character Level Convolutional BiLSTM for Arabic Dialect Identification (Elaraby & Zahran, WANLP 2019)
PDF:
https://aclanthology.org/W19-4636.pdf