Sanjana Krishnan
2022
The Flores-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation
Naman Goyal | Cynthia Gao | Vishrav Chaudhary | Peng-Jen Chen | Guillaume Wenzek | Da Ju | Sanjana Krishnan | Marc’Aurelio Ranzato | Francisco Guzmán | Angela Fan
Transactions of the Association for Computational Linguistics, Volume 10
One of the biggest challenges hindering progress in low-resource and multilingual machine translation is the lack of good evaluation benchmarks. Current evaluation benchmarks either lack good coverage of low-resource languages, consider only restricted domains, or are low quality because they are constructed using semi-automatic procedures. In this work, we introduce the Flores-101 evaluation benchmark, consisting of 3001 sentences extracted from English Wikipedia and covering a variety of different topics and domains. These sentences have been translated into 101 languages by professional translators through a carefully controlled process. The resulting dataset enables better assessment of model quality on the long tail of low-resource languages, including the evaluation of many-to-many multilingual translation systems, as all translations are fully aligned. By publicly releasing such a high-quality and high-coverage dataset, we hope to foster progress in the machine translation community and beyond.