Evaluation Framework for Transfer Learning between Closely Related Lects: A Case Study of Lemko

Ilia Afanasev


Abstract
The creation of a robust evaluation methodology is one of the pivotal issues for transfer learning between closely related lects. The current study proposes to resolve this issue by concisely implementing a group of evaluation methods that enable a more systematic qualitative analysis of errors (for instance, string similarity measures to assess lemmatisation more effectively). The paper introduces a robustness score, a metric that aims to assess the stability of model performance across different datasets. The case study is the morphosyntactic tagging of a small historical (early twentieth century) corpus of Lemko (Slavic clade, Transcarpathian area). It presents a diversity of cross-dependent tasks, made rather complex by the rich Lemko morphology, which is highly influenced by areal convergence processes. The tagger is a pre-trained Stanza model. The study uses modern standard Ukrainian as the source language, as it is the high-resource lect closest to Lemko. The analysis reveals that linguistically aware metrics improve the speed and accuracy of error analysis, especially for errors caused by the differences between source and target lects. The key data contribution is the open-source Lemko dataset obtained during the tagging tasks. Future research directions include a larger-scale test that applies more models to more extensive material.
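The abstract names string similarity measures as a way to assess lemmatisation but does not say which measure is used. As an illustration only, the sketch below assumes a normalised Levenshtein similarity, which gives partial credit to a predicted lemma that is close to the gold lemma instead of counting it as a plain miss. The function names (`levenshtein`, `lemma_similarity`, `mean_lemma_similarity`) are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) using two DP rows."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def lemma_similarity(predicted: str, gold: str) -> float:
    """Assumed measure: 1 - distance / length of the longer string."""
    longest = max(len(predicted), len(gold))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(predicted, gold) / longest


def mean_lemma_similarity(pairs: Iterable[Tuple[str, str]]) -> float:
    """Average similarity over (predicted, gold) lemma pairs."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(lemma_similarity(p, g) for p, g in pairs) / len(pairs)
```

Under this assumption, a lemma that differs from the gold form by a single character in a four-character word scores 0.75 rather than 0, which makes near-misses caused by small orthographic differences between source and target lects easier to separate from unrelated outputs.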
Anthology ID:
2026.vardial-1.25
Volume:
Proceedings of the 13th Workshop on NLP for Similar Languages, Varieties and Dialects
Month:
March
Year:
2026
Address:
Rabat, Morocco
Venues:
VarDial | WS
Publisher:
Association for Computational Linguistics
Pages:
304–316
URL:
https://aclanthology.org/2026.vardial-1.25/
Cite (ACL):
Ilia Afanasev. 2026. Evaluation Framework for Transfer Learning between Closely Related Lects: A Case Study of Lemko. In Proceedings of the 13th Workshop on NLP for Similar Languages, Varieties and Dialects, pages 304–316, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
Evaluation Framework for Transfer Learning between Closely Related Lects: A Case Study of Lemko (Afanasev, VarDial 2026)
PDF:
https://aclanthology.org/2026.vardial-1.25.pdf