Data-Centric Approach at the LoResMT 2026 Turkic Translation Challenge: Russian-Kyrgyz

Dmitry Novokshanov


Abstract
We describe our submission to the Turkic languages translation challenge at LoResMT 2026, which focuses on translation from Russian into Kyrgyz. Our approach combines parallel data, synthetic translations, a comprehensive filtering pipeline, and a four-stage curriculum learning strategy. We compare our system with contemporary baselines and present a model that achieves a chrF++ score of 49.1, placing first in the competition.
Anthology ID: 2026.loresmt-1.22
Volume: Proceedings for the Ninth Workshop on Technologies for Machine Translation of Low Resource Languages (LoResMT 2026)
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Atul Kr. Ojha, Chao-hong Liu, Ekaterina Vylomova, Flammie Pirinen, Jonathan Washington, Nathaniel Oco, Xiaobing Zhao
Venues: LoResMT | WS
Publisher: Association for Computational Linguistics
Pages: 225–230
URL: https://aclanthology.org/2026.loresmt-1.22/
Cite (ACL): Dmitry Novokshanov. 2026. Data-Centric Approach at the LoResMT 2026 Turkic Translation Challenge: Russian-Kyrgyz. In Proceedings for the Ninth Workshop on Technologies for Machine Translation of Low Resource Languages (LoResMT 2026), pages 225–230, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): Data-Centric Approach at the LoResMT 2026 Turkic Translation Challenge: Russian-Kyrgyz (Novokshanov, LoResMT 2026)
PDF: https://aclanthology.org/2026.loresmt-1.22.pdf