Error Analysis of Multilingual Language Models in Machine Translation: A Case Study of English-Amharic Translation

Hizkiel Alemayehu, Hamada Zahera, Axel-Cyrille Ngonga Ngomo


Abstract
Multilingual large language models (mLLMs) have significantly advanced machine translation, yet challenges remain for low-resource languages like Amharic. This study evaluates the performance of state-of-the-art mLLMs, specifically NLLB-200 (NLLB3.3B, NLLB1.3B, Distilled1.3B, NLLB600M) and M2M (M2M1.2B, M2M418M), in bidirectional English-Amharic translation using the Lesan AI dataset. We employed both automatic and human evaluation methods to analyze translation errors. Automatic evaluation used the BLEU, METEOR, chrF, and TER metrics, while human evaluation assessed translation quality at both the word and sentence levels. Sentence-level accuracy was rated by annotators on a scale from 0 to 5, and word-level quality was evaluated using Multidimensional Quality Metrics (MQM). Our findings indicate that the NLLB3.3B model consistently outperformed the other mLLMs across all evaluation methods. Common errors included mistranslation, omission, untranslated segments, and additions, with mistranslation being particularly frequent. Punctuation and spelling errors were rare in our experiment.
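To make one of the automatic metrics named in the abstract concrete, the following is a minimal sketch of a TER-style score: the word-level edit distance between a hypothesis and a reference, divided by the reference length. It is a simplification under stated assumptions, not the evaluation code used in the paper. Full TER also counts block shifts, which are omitted here, and the function names `word_edit_distance` and `simple_ter` are hypothetical, as is the whitespace tokenization.

```python
def word_edit_distance(hyp_tokens, ref_tokens):
    """Levenshtein distance over tokens (insertions, deletions, substitutions)."""
    # prev[j] holds the distance between the hypothesis prefix seen so far
    # and the first j reference tokens.
    prev = list(range(len(ref_tokens) + 1))
    for i, h in enumerate(hyp_tokens, start=1):
        curr = [i]
        for j, r in enumerate(ref_tokens, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis token
                            curr[j - 1] + 1,      # insert a reference token
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def simple_ter(hypothesis, reference):
    """Simplified TER: edits / reference length. Block shifts are NOT modeled."""
    hyp_tokens = hypothesis.split()  # assumption: whitespace tokenization
    ref_tokens = reference.split()
    if not ref_tokens:
        raise ValueError("reference must contain at least one token")
    return word_edit_distance(hyp_tokens, ref_tokens) / len(ref_tokens)


if __name__ == "__main__":
    print(simple_ter("the cat", "the cat sat"))
```

Lower values indicate fewer edits are needed to turn the system output into the reference. Whitespace tokenization is a rough choice for a morphologically rich language such as Amharic, which is one reason character-level metrics like chrF are commonly reported alongside word-level ones.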
Anthology ID: 2024.emnlp-main.1102
Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 19758–19768
URL: https://aclanthology.org/2024.emnlp-main.1102
Cite (ACL): Hizkiel Alemayehu, Hamada Zahera, and Axel-Cyrille Ngonga Ngomo. 2024. Error Analysis of Multilingual Language Models in Machine Translation: A Case Study of English-Amharic Translation. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 19758–19768, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Error Analysis of Multilingual Language Models in Machine Translation: A Case Study of English-Amharic Translation (Alemayehu et al., EMNLP 2024)
PDF: https://aclanthology.org/2024.emnlp-main.1102.pdf