Enhancing Low-Resource NMT with a Multilingual Encoder and Knowledge Distillation: A Case Study

Aniruddha Roy, Pretam Ray, Ayush Maheshwari, Sudeshna Sarkar, Pawan Goyal


Abstract
Neural Machine Translation (NMT) remains a formidable challenge, especially for low-resource languages. Pre-trained sequence-to-sequence (seq2seq) multilingual models, such as mBART-50, have demonstrated impressive performance on various low-resource NMT tasks. However, their pre-training is confined to 50 languages, leaving out numerous low-resource languages, particularly those spoken in the Indian subcontinent. Expanding mBART-50’s language support requires complex pre-training and risks performance decline due to catastrophic forgetting. Given these challenges, this paper explores a framework that leverages the benefits of a pre-trained language model along with knowledge distillation in a seq2seq architecture to facilitate translation for low-resource languages, including those not covered by mBART-50. The proposed framework employs a multilingual encoder-based seq2seq model as the foundational architecture and subsequently applies complementary knowledge distillation techniques to mitigate the impact of imbalanced training. We evaluate the framework on three low-resource Indic languages in four Indic-to-Indic directions, obtaining significant BLEU-4 and chrF improvements over baselines. Further, we conduct a human evaluation to confirm the effectiveness of our approach. Our code is publicly available at https://github.com/raypretam/Two-step-low-res-NMT.
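
As a rough illustration of the distillation component described in the abstract, the sketch below shows a generic token-level knowledge distillation loss for a seq2seq student, mixing cross-entropy on gold targets with a KL term against a teacher's output distribution. This is a minimal sketch under assumed names (alpha, temperature, kd_loss); the paper's exact "complementary knowledge distillation" objective, models, and hyperparameters are described in the PDF and may differ.

# Minimal sketch (assumed setup): token-level knowledge distillation for a seq2seq student.
# Not the paper's exact objective; alpha/temperature and the loss mixing are illustrative.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, target_ids, pad_id, alpha=0.5, temperature=2.0):
    # student_logits, teacher_logits: (batch, seq_len, vocab); target_ids: (batch, seq_len)
    vocab = student_logits.size(-1)
    # Hard-label cross-entropy on the gold target tokens, ignoring padding.
    ce = F.cross_entropy(student_logits.view(-1, vocab), target_ids.view(-1), ignore_index=pad_id)
    # Soft-target KL between temperature-scaled teacher and student distributions.
    t_probs = F.softmax(teacher_logits / temperature, dim=-1)
    s_logprobs = F.log_softmax(student_logits / temperature, dim=-1)
    kl = F.kl_div(s_logprobs, t_probs, reduction="none").sum(-1)  # per-token KL, (batch, seq_len)
    mask = (target_ids != pad_id).float()
    kl = (kl * mask).sum() / mask.sum().clamp(min=1.0)
    # Standard KD mixing; temperature**2 rescales gradients of the soft term.
    return alpha * ce + (1.0 - alpha) * (temperature ** 2) * kl

if __name__ == "__main__":
    # Dummy tensors standing in for student/teacher decoder outputs and gold targets.
    B, T, V, PAD = 2, 5, 100, 1
    s = torch.randn(B, T, V, requires_grad=True)
    t = torch.randn(B, T, V)
    y = torch.randint(2, V, (B, T))
    print(kd_loss(s, t, y, PAD).item())

In practice the teacher and student would be seq2seq translation models (e.g., one built on a pre-trained multilingual encoder), but the loss shape above is independent of the specific architecture.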
Anthology ID:
2024.loresmt-1.7
Volume:
Proceedings of the Seventh Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2024)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Atul Kr. Ojha, Chao-hong Liu, Ekaterina Vylomova, Flammie Pirinen, Jade Abbott, Jonathan Washington, Nathaniel Oco, Valentin Malykh, Varvara Logacheva, Xiaobing Zhao
Venues:
LoResMT | WS
Publisher:
Association for Computational Linguistics
Pages:
64–73
URL:
https://aclanthology.org/2024.loresmt-1.7
Cite (ACL):
Aniruddha Roy, Pretam Ray, Ayush Maheshwari, Sudeshna Sarkar, and Pawan Goyal. 2024. Enhancing Low-Resource NMT with a Multilingual Encoder and Knowledge Distillation: A Case Study. In Proceedings of the Seventh Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2024), pages 64–73, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Enhancing Low-Resource NMT with a Multilingual Encoder and Knowledge Distillation: A Case Study (Roy et al., LoResMT-WS 2024)
PDF:
https://aclanthology.org/2024.loresmt-1.7.pdf