Cross-Lingual and Cross-Domain Transfer Learning for POS Tagging in Historical Germanic Low-Resource Languages

Irene Miani, Sara Stymne, Gregory R. Darwin


Abstract
Although Part-of-Speech (POS) tagging has been widely studied, it still presents several challenges, particularly reduced performance on out-of-domain data. While increasing in-domain training data can be effective, this strategy is often impractical in historical low-resource settings. Cross-lingual transfer learning has shown promise for low-resource languages; however, its impact on domain generalization has received limited attention and may remain insufficient when used in isolation. This study focuses on cross-lingual and cross-domain transfer learning for POS tagging on four historical Germanic low-resource languages in two literary genres. For each language, POS tagged data were extracted and mapped to the Universal Dependencies UPOS tag set to establish a monolingual baseline and train three multilingual models in two dataset configurations. The results were consistent with previous findings, indicating that structural differences between the genres can negatively influence transfer learning. The poetry-only multilingual model showed improvements within that domain compared to the baseline. In contrast, multilingual models trained with all available data had lower performance caused by substantial structural differences in the corpora. This study underlines the importance of investigating the domain-generalization abilities of the models, which may be negatively influenced by substantial structural differences between data. In addition, it sheds light on the study of historical low-resource languages.
Anthology ID:
2026.loreslm-1.47
Volume:
Proceedings of the Second Workshop on Language Models for Low-Resource Languages (LoResLM 2026)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Hansi Hettiarachchi, Tharindu Ranasinghe, Alistair Plum, Paul Rayson, Ruslan Mitkov, Mohamed Gaber, Damith Premasiri, Fiona Anting Tan, Lasitha Uyangodage
Venue:
LoResLM
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
542–558
Language:
URL:
https://aclanthology.org/2026.loreslm-1.47/
DOI:
Bibkey:
Cite (ACL):
Irene Miani, Sara Stymne, and Gregory R. Darwin. 2026. Cross-Lingual and Cross-Domain Transfer Learning for POS Tagging in Historical Germanic Low-Resource Languages. In Proceedings of the Second Workshop on Language Models for Low-Resource Languages (LoResLM 2026), pages 542–558, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
Cross-Lingual and Cross-Domain Transfer Learning for POS Tagging in Historical Germanic Low-Resource Languages (Miani et al., LoResLM 2026)
Copy Citation:
PDF:
https://aclanthology.org/2026.loreslm-1.47.pdf