Assessing the Impact of Typological Features on Multilingual Machine Translation in the Age of Large Language Models

Vitalii Hirak; Jaap Jumelet; Arianna Bisazza

Abstract

Despite major advances in multilingual modeling, large quality disparities persist across languages. Besides the obvious impact of uneven training resources, typological properties have also been proposed to determine the intrinsic difficulty of modeling a language. The existing evidence, however, is mostly based on small monolingual language models or bilingual translation models trained from scratch. We expand on this line of work by analyzing two large pre-trained multilingual translation models, NLLB-200 and Tower+, which are state-of-the-art representatives of encoder-decoder and decoder-only machine translation, respectively. Based on a broad set of languages, we find that target language typology drives translation quality of both models, even after controlling for more trivial factors, such as data resourcedness and writing script. Additionally, languages with certain typological properties benefit more from a wider search of the output space, suggesting that such languages could profit from alternative decoding strategies beyond the standard left-to-right beam search. To facilitate further research in this area, we release a set of fine-grained typological properties for 212 languages of the FLORES+ MT evaluation benchmark.

Anthology ID: 2026.eacl-long.109
Volume: Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Vera Demberg, Kentaro Inui, Lluís Marquez
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 2416–2434
URL: https://aclanthology.org/2026.eacl-long.109/
Cite (ACL): Vitalii Hirak, Jaap Jumelet, and Arianna Bisazza. 2026. Assessing the Impact of Typological Features on Multilingual Machine Translation in the Age of Large Language Models. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2416–2434, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): Assessing the Impact of Typological Features on Multilingual Machine Translation in the Age of Large Language Models (Hirak et al., EACL 2026)
PDF: https://aclanthology.org/2026.eacl-long.109.pdf
Checklist: 2026.eacl-long.109.checklist.pdf