DevLake at LoResMT 2026: The Impact of Pre-training and Model Scale on Russian-Bashkir Low-Resource Translation

Vyacheslav Tyurin


Abstract
This paper describes Team DevLake's submission to the LoResMT 2026 Shared Task on Russian-Bashkir machine translation. We conducted a comparative study of three neural architectures: NLLB-200 (1.3B), M2M-100 (418M), and MarianMT (77M). To overcome hardware constraints, we employed parameter-efficient fine-tuning (QLoRA) and extensive data filtering with a domain-specific BERT-based classifier. Our experiments demonstrate that the presence of the target language (Bashkir) in a model's pre-training data is the decisive factor for performance. Our best system, a fine-tuned NLLB-200-1.3B model augmented with exact-match retrieval, achieved a chrF++ score of 52.67. We also report negative results with custom tokenization for smaller models, offering insights into the limitations of vocabulary adaptation without extensive pre-training.
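The abstract describes QLoRA fine-tuning of NLLB-200-1.3B under tight hardware constraints. The paper does not publish its training code, so the following is only a minimal sketch of such a setup using the Hugging Face transformers, peft, and bitsandbytes libraries, with NLLB's FLORES-style language codes (rus_Cyrl for Russian, bak_Cyrl for Bashkir). The LoRA rank, alpha, and target modules are illustrative assumptions, not the author's reported values.

```python
# Minimal QLoRA setup sketch for Russian->Bashkir with NLLB-200-1.3B.
# Hyperparameters are illustrative assumptions, not the paper's values.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

MODEL = "facebook/nllb-200-1.3B"

# 4-bit NF4 quantization keeps the 1.3B base model within a small GPU budget.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# NLLB-200 uses FLORES-style codes: rus_Cyrl (Russian), bak_Cyrl (Bashkir).
tokenizer = AutoTokenizer.from_pretrained(MODEL, src_lang="rus_Cyrl", tgt_lang="bak_Cyrl")
model = AutoModelForSeq2SeqLM.from_pretrained(
    MODEL, quantization_config=bnb, device_map="auto"
)
# Casts norms to fp32 and prepares the quantized model for gradient updates.
model = prepare_model_for_kbit_training(model)

# Low-rank adapters on the attention projections; rank/alpha are assumptions.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="SEQ_2_SEQ_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of the 1.3B weights

# Smoke test: translate one Russian sentence ("Good afternoon!"),
# forcing Bashkir as the target language at decoding time.
inputs = tokenizer("Добрый день!", return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("bak_Cyrl"),
    max_new_tokens=32,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

With adapters attached this way, the model can be trained with a standard Seq2SeqTrainer; only the LoRA matrices receive gradients, which is what makes the 1.3B model trainable on constrained hardware.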
Anthology ID:
2026.loresmt-1.18
Volume:
Proceedings of the Ninth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2026)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Atul Kr. Ojha, Chao-hong Liu, Ekaterina Vylomova, Flammie Pirinen, Jonathan Washington, Nathaniel Oco, Xiaobing Zhao
Venues:
LoResMT | WS
Publisher:
Association for Computational Linguistics
Pages:
209–212
URL:
https://aclanthology.org/2026.loresmt-1.18/
Cite (ACL):
Vyacheslav Tyurin. 2026. DevLake at LoResMT 2026: The Impact of Pre-training and Model Scale on Russian-Bashkir Low-Resource Translation. In Proceedings of the Ninth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2026), pages 209–212, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
DevLake at LoResMT 2026: The Impact of Pre-training and Model Scale on Russian-Bashkir Low-Resource Translation (Tyurin, LoResMT 2026)
PDF:
https://aclanthology.org/2026.loresmt-1.18.pdf