Innokentiy S. Humonen
2026
Shughni Machine Translation Enhanced by Donor Languages
Dmitry Novokshanov | Innokentiy S. Humonen | Ilya Makarov
The Proceedings of the First Workshop on NLP and LLMs for the Iranian Language Family
This paper presents the first machine translation system for Shughni, an extremely low-resource Eastern Iranian language spoken in Tajikistan and Afghanistan. We fine-tune NLLB-200 models and explore auxiliary language selection through typological similarity and "super-donor" experiments. Our final Shughni–Russian model achieves a chrF++ score of 36.3 (45.7 on BivalTyp data), establishing the first computational translation resource for this language. Beyond reporting system performance, this work demonstrates a practical path toward supporting languages with virtually no prior MT resources. Our demo system with Shughni-Russian-English translation (Russian serves as a pivot language for the Shughni-English pair) is available on Hugging Face (https://huggingface.co/spaces/Novokshanov/Shughni-Translator).
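The pivot setup mentioned in the abstract, where Shughni-English translation is routed through Russian, can be illustrated with a minimal sketch. It only shows how two translation steps compose. The names `make_pivot_translator`, `toy_sgh_to_rus`, and `toy_rus_to_eng` are hypothetical, and the toy functions merely tag their input; they stand in for real models and do not reflect how the demo is actually implemented.

```python
from typing import Callable

# A translator is modeled as any function from source text to target text.
Translator = Callable[[str], str]


def make_pivot_translator(src_to_pivot: Translator,
                          pivot_to_tgt: Translator) -> Translator:
    """Compose two translators into one that goes through a pivot language."""
    def translate(text: str) -> str:
        pivot_text = src_to_pivot(text)   # e.g. Shughni -> Russian
        return pivot_to_tgt(pivot_text)   # e.g. Russian -> English
    return translate


# Hypothetical stand-ins for real models: they only wrap the input in a tag
# so the order of the two steps is visible in the output.
def toy_sgh_to_rus(text: str) -> str:
    return f"[rus:{text}]"


def toy_rus_to_eng(text: str) -> str:
    return f"[eng:{text}]"


sgh_to_eng = make_pivot_translator(toy_sgh_to_rus, toy_rus_to_eng)
result = sgh_to_eng("example")
print(result)  # [eng:[rus:example]]
```

In a pipeline of this shape, errors made in the first step are passed on to the second, so the quality of the source-to-pivot model bounds the quality of the final output.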