Continual Learning for Multilingual Neural Machine Translation via Dual Importance-based Model Division

Junpeng Liu, Kaiyu Huang, Hao Yu, Jiuyi Li, Jinsong Su, Degen Huang


Abstract
A persistent goal of multilingual neural machine translation (MNMT) is to continually adapt the model to support new language pairs or improve existing ones without access to the previous training data. To achieve this, existing methods primarily focus on preventing catastrophic forgetting by trading off performance between the original and new language pairs, leading to sub-optimal results on both sets of translation tasks. To mitigate this problem, we propose a dual importance-based model division method that divides the model parameters into two parts and models the translation of the original and new tasks separately. Specifically, we first remove the parameters that are negligible to the original tasks but essential to the new tasks, obtaining a pruned model that is responsible for the original translation tasks. We then expand the pruned model with external parameters and fine-tune only the newly added parameters on the new training data; the whole fine-tuned model is used for the new translation tasks. Experimental results show that our method can efficiently adapt the original model to various new translation tasks while retaining performance on the original tasks. Further analyses demonstrate that our method consistently outperforms several strong baselines under different incremental translation scenarios.
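The abstract's three-step procedure (estimate parameter importance, prune, then expand and fine-tune) can be illustrated with a short PyTorch sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the Fisher-style squared-gradient importance proxy, the keep_ratio threshold, and the reuse of freed parameter slots in place of literally appended external parameters are all simplifications made for brevity; the paper's exact importance criterion and expansion mechanism are described in the full text.

import torch

def estimate_importance(model, data_loader, loss_fn, device="cpu"):
    # Approximate per-parameter importance as accumulated squared gradients
    # (a Fisher-style proxy; the paper's exact criterion may differ).
    importance = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    model.train()
    for src, tgt in data_loader:  # schematic (src, tgt) batch interface
        model.zero_grad()
        loss = loss_fn(model(src.to(device)), tgt.to(device))
        loss.backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                importance[n] += p.grad.detach() ** 2
    return importance

def build_masks(orig_imp, new_imp, keep_ratio=0.9):
    # For each tensor, keep the parameters most important to the original
    # tasks; the remaining slots are freed for the new tasks.
    masks = {}
    for n in orig_imp:
        # The "dual" criterion: favour parameters that matter to the original
        # tasks and penalise those the new tasks need more.
        score = orig_imp[n] - new_imp[n]
        k = int(keep_ratio * score.numel())
        threshold = score.flatten().kthvalue(max(score.numel() - k, 1)).values
        masks[n] = (score > threshold).float()  # 1 = retained, 0 = freed
    return masks

def finetune_freed_params(model, masks, data_loader, loss_fn, lr=1e-4, device="cpu"):
    # Fine-tune only the freed slots on the new-task data, leaving the
    # retained original-task parameters untouched.
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for src, tgt in data_loader:
        opt.zero_grad()
        loss = loss_fn(model(src.to(device)), tgt.to(device))
        loss.backward()
        for n, p in model.named_parameters():
            if p.grad is not None and n in masks:
                p.grad.mul_(1.0 - masks[n])  # block updates to retained params
        opt.step()

At inference time, under this sketch the original tasks would run the pruned model (each tensor multiplied by its mask so the freed slots are zeroed out), while the new tasks would use the whole fine-tuned model, matching the division of responsibilities described in the abstract.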
Anthology ID:
2023.emnlp-main.736
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
12011–12027
URL:
https://aclanthology.org/2023.emnlp-main.736
DOI:
10.18653/v1/2023.emnlp-main.736
Cite (ACL):
Junpeng Liu, Kaiyu Huang, Hao Yu, Jiuyi Li, Jinsong Su, and Degen Huang. 2023. Continual Learning for Multilingual Neural Machine Translation via Dual Importance-based Model Division. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 12011–12027, Singapore. Association for Computational Linguistics.
Cite (Informal):
Continual Learning for Multilingual Neural Machine Translation via Dual Importance-based Model Division (Liu et al., EMNLP 2023)
PDF:
https://aclanthology.org/2023.emnlp-main.736.pdf
Video:
https://aclanthology.org/2023.emnlp-main.736.mp4