Gradient-based Gradual Pruning for Language-Specific Multilingual Neural Machine Translation

Dan He, Minh-Quang Pham, Thanh-Le Ha, Marco Turchi

doi:10.18653/v1/2023.emnlp-main.43

Abstract

Multilingual neural machine translation (MNMT) offers the convenience of translating between multiple languages with a single model. However, MNMT often suffers from performance degradation in high-resource languages compared to bilingual counterparts. This degradation is commonly attributed to parameter interference, which occurs when parameters are fully shared across all language pairs. To tackle this issue, we propose a gradient-based gradual pruning technique for MNMT. Our approach aims to identify an optimal sub-network for each language pair within the multilingual model, using gradient-based information as the pruning criterion and gradually increasing the pruning ratio as the pruning schedule. It allows partial parameter sharing across language pairs to alleviate interference, while each pair preserves its own unique parameters to capture language-specific information. Comprehensive experiments on IWSLT and WMT datasets show that our approach yields a notable performance gain on both datasets.
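The abstract names two ingredients, a gradient-based pruning criterion and a gradually increasing pruning ratio, without giving their exact form. The following is a minimal sketch of how the two could fit together, not the authors' implementation. The importance score `|w * g|`, the cubic ramp in `pruning_ratio`, the function names, and all numbers are assumptions chosen for illustration.

```python
def pruning_ratio(step, total_steps, final_ratio):
    """Cubic ramp from 0 to final_ratio (an assumed schedule, not taken from the paper)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return final_ratio * (1.0 - (1.0 - progress) ** 3)


def gradient_mask(weights, grads, ratio):
    """Return a 0/1 mask that prunes the `ratio` fraction of weights
    with the lowest |w * g| score (an assumed gradient-based criterion)."""
    scores = [abs(w * g) for w, g in zip(weights, grads)]
    n_prune = int(ratio * len(weights))
    # Indices in ascending order of importance; the first n_prune are pruned.
    order = sorted(range(len(weights)), key=lambda i: scores[i])
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


# Toy shared parameters and made-up per-language-pair gradients.
weights = [0.5, -1.0, 0.2, 0.8, -0.1, 0.3]
grads_by_pair = {
    "en-de": [0.4, 0.4, 0.9, 0.0, 0.5, 0.1],
    "en-fr": [0.1, 0.3, 0.1, 0.3, 0.9, 0.2],
}

masks = {}
for pair, grads in grads_by_pair.items():
    # The ratio grows over three pruning steps. In real training the model
    # would keep training between steps and the gradients would be recomputed.
    for step in range(1, 4):
        ratio = pruning_ratio(step, total_steps=3, final_ratio=0.5)
        masks[pair] = gradient_mask(weights, grads, ratio)

# Parameters kept by every language pair form the shared part of the sub-networks.
shared = [i for i in range(len(weights)) if all(m[i] for m in masks.values())]
print(masks, shared)
```

With these toy numbers, each pair ends up keeping half of the parameters, and only index 1 is kept by both pairs, which illustrates partial sharing alongside pair-specific parameters.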

Anthology ID: 2023.emnlp-main.43
Volume: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 654–670
URL: https://aclanthology.org/2023.emnlp-main.43/
DOI: 10.18653/v1/2023.emnlp-main.43
Cite (ACL): Dan He, Minh-Quang Pham, Thanh-Le Ha, and Marco Turchi. 2023. Gradient-based Gradual Pruning for Language-Specific Multilingual Neural Machine Translation. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 654–670, Singapore. Association for Computational Linguistics.
Cite (Informal): Gradient-based Gradual Pruning for Language-Specific Multilingual Neural Machine Translation (He et al., EMNLP 2023)
PDF: https://aclanthology.org/2023.emnlp-main.43.pdf
Video: https://aclanthology.org/2023.emnlp-main.43.mp4
