Recipe for Zero-shot POS Tagging: Is It Useful in Realistic Scenarios?

Zeno Vandenbulcke; Lukas Vermeire; Miryam de Lhoneux

doi:10.18653/v1/2024.mrl-1.9

Abstract

POS tagging plays a fundamental role in numerous applications. While POS taggers are highly accurate in well-resourced settings, they lag behind when training data is limited or missing. This paper focuses on POS tagging for languages with limited data. We seek to identify favourable characteristics of datasets for training POS tagging models on related languages without any training on the target language, that is, a zero-shot approach. We investigate both mono- and multilingual models trained on related languages and compare their accuracies. Additionally, we compare these results with models trained directly on the target language itself. We do this for three low-resource target languages, for each of which we select several support languages. Our research highlights the importance of careful dataset selection for developing effective zero-shot POS tagging models. In particular, a strong linguistic relationship and high-quality datasets ensure optimal results. For extremely low-resource languages, zero-shot training proves to be a viable option.

Anthology ID: 2024.mrl-1.9
Volume: Proceedings of the Fourth Workshop on Multilingual Representation Learning (MRL 2024)
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Jonne Sälevä, Abraham Owodunni
Venue: MRL
Publisher: Association for Computational Linguistics
Pages: 137–147
URL: https://aclanthology.org/2024.mrl-1.9/
DOI: 10.18653/v1/2024.mrl-1.9
Cite (ACL): Zeno Vandenbulcke, Lukas Vermeire, and Miryam de Lhoneux. 2024. Recipe for Zero-shot POS Tagging: Is It Useful in Realistic Scenarios? In Proceedings of the Fourth Workshop on Multilingual Representation Learning (MRL 2024), pages 137–147, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Recipe for Zero-shot POS Tagging: Is It Useful in Realistic Scenarios? (Vandenbulcke et al., MRL 2024)
PDF: https://aclanthology.org/2024.mrl-1.9.pdf
