Benchmarking Low-Resource Machine Translation Systems

Ana Silva, Nikit Srivastava, Tatiana Moteu Ngoli, Michael Röder, Diego Moussallem, Axel-Cyrille Ngonga Ngomo


Abstract
Assessing the performance of machine translation systems is of critical value, especially for languages with lower resource availability. Due to the large evaluation effort required by the translation task, studies often compare new systems against single systems or commercial solutions. Consequently, it is often unclear which system performs best for specific languages. This work benchmarks publicly available translation systems across 4 datasets and 26 languages, including low-resource languages. We consider both effectiveness and efficiency in our evaluation. Our results are made public through BENG—a FAIR benchmarking platform for Natural Language Generation tasks.
Anthology ID:
2024.loresmt-1.18
Volume:
Proceedings of the Seventh Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2024)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Atul Kr. Ojha, Chao-hong Liu, Ekaterina Vylomova, Flammie Pirinen, Jade Abbott, Jonathan Washington, Nathaniel Oco, Valentin Malykh, Varvara Logacheva, Xiaobing Zhao
Venues:
LoResMT | WS
Publisher:
Association for Computational Linguistics
Pages:
175–185
URL:
https://aclanthology.org/2024.loresmt-1.18
Cite (ACL):
Ana Silva, Nikit Srivastava, Tatiana Moteu Ngoli, Michael Röder, Diego Moussallem, and Axel-Cyrille Ngonga Ngomo. 2024. Benchmarking Low-Resource Machine Translation Systems. In Proceedings of the Seventh Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2024), pages 175–185, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Benchmarking Low-Resource Machine Translation Systems (Silva et al., LoResMT-WS 2024)
PDF:
https://aclanthology.org/2024.loresmt-1.18.pdf