Using Neural Transfer Learning for Morpho-syntactic Tagging of South-Slavic Languages Tweets

Sara Meftah; Nasredine Semmar; Fatiha Sadat; Stephan Raaijmakers

Abstract

In this paper, we describe a morpho-syntactic tagger for tweets, an important component of the CEA List DeepLIMA tool, a multilingual text analysis platform based on deep learning. The tagger was built for the Morpho-syntactic Tagging of Tweets (MTT) shared task of the 2018 VarDial Evaluation Campaign. The MTT task focuses on morpho-syntactic annotation of non-canonical Twitter varieties of three South-Slavic languages: Slovene, Croatian and Serbian. We propose a neural network model trained end-to-end for the three languages, without any need for task-specific or domain-specific feature engineering. The proposed approach combines character-level and word-level representations. Given the lack of annotated social media data for South-Slavic languages, we have also implemented a cross-domain Transfer Learning (TL) approach to exploit any available related out-of-domain annotated data.

Anthology ID: W18-3927
Volume: Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018)
Month: August
Year: 2018
Address: Santa Fe, New Mexico, USA
Editors: Marcos Zampieri, Preslav Nakov, Nikola Ljubešić, Jörg Tiedemann, Shervin Malmasi, Ahmed Ali
Venue: VarDial
Publisher: Association for Computational Linguistics
Pages: 235–243
URL: https://aclanthology.org/W18-3927/
Cite (ACL): Sara Meftah, Nasredine Semmar, Fatiha Sadat, and Stephan Raaijmakers. 2018. Using Neural Transfer Learning for Morpho-syntactic Tagging of South-Slavic Languages Tweets. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018), pages 235–243, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal): Using Neural Transfer Learning for Morpho-syntactic Tagging of South-Slavic Languages Tweets (Meftah et al., VarDial 2018)
PDF: https://aclanthology.org/W18-3927.pdf
