Continual Learning for Neural Machine Translation

Yue Cao, Hao-Ran Wei, Boxing Chen, Xiaojun Wan


Abstract
Neural machine translation (NMT) models are data-driven and require large-scale training corpora. In practical applications, NMT models are usually trained on a general-domain corpus and then fine-tuned by continuing training on an in-domain corpus. However, this bears the risk of catastrophic forgetting, in which performance on the general domain degrades drastically. In this work, we propose a new continual learning framework for NMT models. We consider a scenario where training comprises multiple stages and propose a dynamic knowledge distillation technique to systematically alleviate catastrophic forgetting. We also find that a bias arises in the output linear projection when fine-tuning on the in-domain corpus, and propose a bias-correction module to eliminate it. We conduct experiments on three representative settings of NMT application. Experimental results show that the proposed method achieves superior performance compared to baseline models in all settings.
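The abstract does not spell out the distillation objective, but frameworks of this kind typically build on the standard knowledge-distillation loss: the fine-tuned (student) model is trained on the in-domain data while being regularized toward the general-domain (teacher) model's output distribution. Below is a minimal, self-contained sketch of that generic per-token loss; the function names, the temperature, and the fixed mixing weight `alpha` are illustrative assumptions, not the paper's dynamic weighting scheme.

```python
import math

def softmax(logits, temperature=1.0):
    # Numerically stable softmax with optional temperature scaling.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, target_idx,
                      temperature=2.0, alpha=0.5):
    """Per-token loss interpolating cross-entropy on the gold token with
    KL(teacher || student) at temperature T (scaled by T^2, as is
    conventional, so gradient magnitudes stay comparable)."""
    # Cross-entropy of the student w.r.t. the gold target token.
    p_student = softmax(student_logits)
    ce = -math.log(p_student[target_idx])
    # KL divergence between temperature-softened distributions.
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_t, p_s))
    return (1 - alpha) * ce + alpha * (temperature ** 2) * kl
```

With identical student and teacher logits the KL term vanishes and only the (down-weighted) cross-entropy remains; a dynamic variant, as proposed in the paper, would adjust the interpolation rather than fix `alpha`.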
Anthology ID:
2021.naacl-main.310
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
June
Year:
2021
Address:
Online
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
3964–3974
URL:
https://aclanthology.org/2021.naacl-main.310
DOI:
10.18653/v1/2021.naacl-main.310
Cite (ACL):
Yue Cao, Hao-Ran Wei, Boxing Chen, and Xiaojun Wan. 2021. Continual Learning for Neural Machine Translation. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3964–3974, Online. Association for Computational Linguistics.
Cite (Informal):
Continual Learning for Neural Machine Translation (Cao et al., NAACL 2021)
PDF:
https://aclanthology.org/2021.naacl-main.310.pdf