Multilingual Pre-training Meets Supervised Neural Machine Translation: A Reproducible Evaluation on English–French and Finnish Translation

Benyamin Ahmadnia, Yeswanth Soma, Hossein Sarrafzadeh


Abstract
This paper presents a comparative evaluation of Transformer-based Neural Machine Translation (NMT) models and pre-trained multilingual sequence-to-sequence models in the context of moderately resourced machine translation (MT). Using English–French (high-resource) and English–Finnish (moderate-resource) as case studies, we assess the effectiveness of fine-tuning the mBART model versus training standard NMT systems from scratch. Our experiments incorporate data-augmentation techniques such as back-translation and evaluate translation quality using the BLEU, TER, METEOR, and COMET metrics. We also provide a detailed error analysis that covers lexical choice, named-entity handling, and word order. While mBART demonstrates consistent improvements over classical NMT, particularly in handling complex linguistic structures and sparse training data, we acknowledge the challenges of deploying large models in resource-constrained settings. Our findings highlight practical trade-offs between model complexity, resource availability, and translation quality in multilingual scenarios.
Anthology ID:
2025.ranlp-1.5
Volume:
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Month:
September
Year:
2025
Address:
Varna, Bulgaria
Editors:
Galia Angelova, Maria Kunilovskaya, Marie Escribe, Ruslan Mitkov
Venue:
RANLP
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
38–47
URL:
https://aclanthology.org/2025.ranlp-1.5/
Cite (ACL):
Benyamin Ahmadnia, Yeswanth Soma, and Hossein Sarrafzadeh. 2025. Multilingual Pre-training Meets Supervised Neural Machine Translation: A Reproducible Evaluation on English–French and Finnish Translation. In Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era, pages 38–47, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Multilingual Pre-training Meets Supervised Neural Machine Translation: A Reproducible Evaluation on English–French and Finnish Translation (Ahmadnia et al., RANLP 2025)
PDF:
https://aclanthology.org/2025.ranlp-1.5.pdf