Sub-label dependencies for Neural Morphological Tagging – The Joint Submission of University of Colorado and University of Helsinki for VarDial 2018

Miikka Silfverberg, Senka Drobac


Abstract
This paper presents the submission of the UH&CU team (a joint University of Colorado and University of Helsinki team) to the VarDial 2018 shared task on morphosyntactic tagging of Croatian, Slovenian and Serbian tweets. Our system is a bidirectional LSTM tagger that emits each tag as a character sequence using an LSTM generator. This allows it to handle unknown tags and combinations of several tags for a single token, both of which occur in the shared task data sets. To the best of our knowledge, using an LSTM generator is a novel approach. The system delivers sizable improvements of more than 6 percentage points over a baseline trigram tagger. Overall, the performance of our system is quite even across all three languages.
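The idea of emitting a tag as a character sequence can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names and the MULTEXT-East-style tag strings are assumptions chosen for illustration, and no neural model is involved. It only shows why a character-level output space can cover a tag that never appeared as a whole in training.

```python
# Illustrative sketch only: helper names and tag strings are hypothetical,
# written in the style of MULTEXT-East morphosyntactic descriptions.

END = "<eos>"  # assumed end-of-sequence symbol for the generator


def tag_to_symbols(tag):
    """Split a tag into the character symbols a generator would emit."""
    return list(tag) + [END]


def build_char_vocab(tags):
    """Collect every character seen in the training tags."""
    return {ch for tag in tags for ch in tag}


def is_generable(tag, vocab):
    """A tag can be spelled out if all of its characters are in the vocabulary."""
    return all(ch in vocab for ch in tag)


train_tags = ["Ncmsn", "Ncfsa", "Vmr3s"]
char_vocab = build_char_vocab(train_tags)

# This tag is absent from the training set, so a classifier over whole
# tags could never predict it, but its characters are all known.
unseen_tag = "Ncfsn"
print(unseen_tag in train_tags)              # whole tag was never seen
print(is_generable(unseen_tag, char_vocab))  # yet it can be spelled out
print(tag_to_symbols(unseen_tag))
```

In an actual tagger the generator would be conditioned on the encoder state for each token and trained to produce these symbols one at a time; the sketch covers only the output representation.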
Anthology ID:
W18-3904
Volume:
Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018)
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Marcos Zampieri, Preslav Nakov, Nikola Ljubešić, Jörg Tiedemann, Shervin Malmasi, Ahmed Ali
Venue:
VarDial
Publisher:
Association for Computational Linguistics
Pages:
37–45
URL:
https://aclanthology.org/W18-3904
Cite (ACL):
Miikka Silfverberg and Senka Drobac. 2018. Sub-label dependencies for Neural Morphological Tagging – The Joint Submission of University of Colorado and University of Helsinki for VarDial 2018. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018), pages 37–45, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Sub-label dependencies for Neural Morphological Tagging – The Joint Submission of University of Colorado and University of Helsinki for VarDial 2018 (Silfverberg & Drobac, VarDial 2018)
PDF:
https://aclanthology.org/W18-3904.pdf
Data
MULTEXT-East