The GW/LT3 VarDial 2016 Shared Task System for Dialects and Similar Languages Detection

Ayah Zirikly, Bart Desmet, Mona Diab


Abstract
This paper describes the GW/LT3 contribution to the 2016 VarDial shared task on the identification of similar languages (task 1) and Arabic dialects (task 2). For both tasks, we experimented with Logistic Regression and Neural Network classifiers in isolation. Additionally, we implemented a cascaded classifier that consists of coarse and fine-grained classifiers for task 1 and a classifier ensemble with majority voting for task 2. The submitted systems obtained state-of-the-art performance and ranked first for the evaluation on social media data (test sets B1 and B2 for task 1), with a maximum weighted F1 score of 91.94%.
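The two combination strategies named in the abstract, a coarse-to-fine cascade and a majority-voting ensemble, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`majority_vote`, `cascade`), the toy keyword rules, and the label strings are all hypothetical and chosen only for illustration. In the paper, the component classifiers are Logistic Regression and Neural Network models rather than keyword rules.

```python
from collections import Counter
from typing import Callable, Dict, Sequence

# A classifier is modelled here as any function from text to a label.
Classifier = Callable[[str], str]


def majority_vote(predictions: Sequence[str]) -> str:
    """Return the most frequent label among the ensemble's predictions.

    Tie-breaking is an assumption of this sketch: the label that appears
    first in `predictions` wins.
    """
    if not predictions:
        raise ValueError("no predictions given")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label
    raise AssertionError("unreachable")


def cascade(text: str, coarse: Classifier,
            fine_by_group: Dict[str, Classifier]) -> str:
    """Pick a language group first, then a variety within that group.

    If no fine-grained classifier is registered for the predicted group,
    the coarse label is returned unchanged.
    """
    group = coarse(text)
    fine = fine_by_group.get(group)
    return fine(text) if fine is not None else group


# Toy stand-ins for trained models (hypothetical keyword rules).
def toy_coarse(text: str) -> str:
    return "pt" if "não" in text else "es"


def toy_fine_pt(text: str) -> str:
    return "pt-BR" if "você" in text else "pt-PT"


toy_fine = {"pt": toy_fine_pt}
```

With these toy rules, `cascade("você não sabe", toy_coarse, toy_fine)` routes the sentence to the hypothetical Portuguese group and then to a variety label, while `majority_vote` would be applied to the per-sentence outputs of several independently trained dialect classifiers.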
Anthology ID: W16-4804
Volume: Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3)
Month: December
Year: 2016
Address: Osaka, Japan
Editors: Preslav Nakov, Marcos Zampieri, Liling Tan, Nikola Ljubešić, Jörg Tiedemann, Shervin Malmasi
Venue: VarDial
Publisher: The COLING 2016 Organizing Committee
Pages: 33–41
URL: https://aclanthology.org/W16-4804
Cite (ACL): Ayah Zirikly, Bart Desmet, and Mona Diab. 2016. The GW/LT3 VarDial 2016 Shared Task System for Dialects and Similar Languages Detection. In Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), pages 33–41, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal): The GW/LT3 VarDial 2016 Shared Task System for Dialects and Similar Languages Detection (Zirikly et al., VarDial 2016)
PDF: https://aclanthology.org/W16-4804.pdf