DTeam @ VarDial 2019: Ensemble based on skip-gram and triplet loss neural networks for Moldavian vs. Romanian cross-dialect topic identification

Diana Tudoreanu


Abstract
This paper presents the solution proposed by DTeam in the VarDial 2019 Evaluation Campaign for the Moldavian vs. Romanian cross-dialect topic identification task. The proposed solution is a Support Vector Machine (SVM) ensemble composed of two character-level neural networks. The first network is a skip-gram classification model consisting of an embedding layer, three convolutional layers and two fully-connected layers. The second network has a similar architecture, but is trained using the triplet loss function.
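As a rough illustration of the triplet loss mentioned in the abstract, the following is a minimal sketch of the standard formulation, which pulls an anchor embedding closer to a same-class (positive) embedding than to a different-class (negative) one by at least a margin. The Euclidean distance, the margin of 1.0, and the function names are assumptions made for illustration; the abstract does not specify these details.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The margin value and the distance function are illustrative
    assumptions, not settings taken from the paper.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical 2-dimensional embeddings, chosen only for the example.
anchor = [0.0, 0.0]
same_class = [0.0, 1.0]   # distance 1 from the anchor
other_class = [3.0, 4.0]  # distance 5 from the anchor

# The positive is already closer than the negative by more than the margin,
# so this triplet contributes no loss.
easy = triplet_loss(anchor, same_class, other_class)

# Swapping the roles produces a violated triplet with a positive loss.
hard = triplet_loss(anchor, other_class, same_class)
```

In this sketch the loss is zero once the negative lies farther from the anchor than the positive by at least the margin, so only violating triplets would contribute a training signal.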
Anthology ID:
W19-1422
Volume:
Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects
Month:
June
Year:
2019
Address:
Ann Arbor, Michigan
Editors:
Marcos Zampieri, Preslav Nakov, Shervin Malmasi, Nikola Ljubešić, Jörg Tiedemann, Ahmed Ali
Venue:
VarDial
Publisher:
Association for Computational Linguistics
Pages:
202–208
URL:
https://aclanthology.org/W19-1422
DOI:
10.18653/v1/W19-1422
Cite (ACL):
Diana Tudoreanu. 2019. DTeam @ VarDial 2019: Ensemble based on skip-gram and triplet loss neural networks for Moldavian vs. Romanian cross-dialect topic identification. In Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects, pages 202–208, Ann Arbor, Michigan. Association for Computational Linguistics.
Cite (Informal):
DTeam @ VarDial 2019: Ensemble based on skip-gram and triplet loss neural networks for Moldavian vs. Romanian cross-dialect topic identification (Tudoreanu, VarDial 2019)
PDF:
https://aclanthology.org/W19-1422.pdf
Data
MOROCO