2024
ICL: Iterative Continual Learning for Multi-domain Neural Machine Translation
Zhibo Man
|
Kaiyu Huang
|
Yujie Zhang
|
Yuanmeng Chen
|
Yufeng Chen
|
Jinan Xu
Findings of the Association for Computational Linguistics: EMNLP 2024
In practical scenarios, multi-domain neural machine translation (MDNMT) aims to continuously acquire knowledge from new domain data while retaining old knowledge. Previous work learns each new domain's knowledge separately via parameter-isolation methods, which capture the new knowledge effectively. However, task-specific parameters isolate the models from one another, hindering the mutual transfer of knowledge between new domains. Given the scarcity of domain-specific corpora, we aim to make full use of the data from multiple new domains, leveraging previously acquired domain knowledge when modeling subsequent domains. To this end, we propose an Iterative Continual Learning (ICL) framework for multi-domain neural machine translation. Specifically, when each new domain arrives, we (1) first build a pluggable incremental learning model, and (2) then design an iterative updating algorithm that continuously updates the original model, which can be used flexibly to construct subsequent domain models. Furthermore, we design a domain knowledge transfer mechanism to enhance fine-grained domain-specific representations, thereby resolving the word ambiguity caused by mixing domain data. Experimental results on the UM-Corpus and OPUS multi-domain datasets show the superior performance of the proposed model over representative baselines.
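The pluggable-model-plus-iterative-update idea from the abstract can be sketched in a few lines. Everything below is illustrative, not the paper's code: the class name, `merge_rate`, and the interpolation rule are invented. A toy backbone gains one pluggable adapter per incoming domain, and each new adapter is folded back into the backbone so that later domains start from an updated base that already carries earlier domain knowledge.

```python
# Hedged sketch of an iterative continual learning loop for MDNMT.
# All names (IterativeContinualLearner, merge_rate) are hypothetical.

class IterativeContinualLearner:
    def __init__(self, base_params, merge_rate=0.5):
        self.base_params = list(base_params)   # shared backbone parameters
        self.adapters = {}                     # domain name -> pluggable adapter params
        self.merge_rate = merge_rate           # how strongly new knowledge flows back

    def add_domain(self, name, adapter_params):
        """Plug in a new domain adapter, then iteratively update the base."""
        self.adapters[name] = list(adapter_params)
        # Iterative update: interpolate the base toward the new adapter so the
        # updated base can seed the next domain's model.
        self.base_params = [
            (1 - self.merge_rate) * b + self.merge_rate * a
            for b, a in zip(self.base_params, adapter_params)
        ]

    def forward(self, x, domain):
        """Toy 'model': dot product of the input with base + domain adapter."""
        params = [b + a for b, a in zip(self.base_params, self.adapters[domain])]
        return sum(p * xi for p, xi in zip(params, x))

learner = IterativeContinualLearner(base_params=[1.0, 0.0], merge_rate=0.5)
learner.add_domain("news", [0.0, 2.0])
learner.add_domain("medical", [2.0, 0.0])
print(learner.base_params)  # → [1.25, 0.5]: the base has absorbed both domains
```

The key design point the sketch mirrors is that the base is not frozen across domains: each adapter both specializes to its own domain and leaves a trace in the shared parameters for the next one.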
2023
Exploring Domain-shared and Domain-specific Knowledge in Multi-Domain Neural Machine Translation
Zhibo Man
|
Yujie Zhang
|
Yuanmeng Chen
|
Yufeng Chen
|
Jinan Xu
Proceedings of Machine Translation Summit XIX, Vol. 1: Research Track
Multi-domain neural machine translation (NMT), which trains a single model on data mixed from multiple domains, has become a significant research topic in domain-adaptation machine translation. Multi-domain NMT aims to improve performance on low-resource domains through data augmentation; however, mixed-domain data introduces more translation ambiguity. Previous work focused on learning either domain-general or domain-specific knowledge, so acquiring both simultaneously remains a challenge. To this end, we propose a unified framework for simultaneously learning domain-general and domain-specific knowledge, and we are the first to apply parameter differentiation in multi-domain NMT. Specifically, we design a differentiation criterion and a differentiation granularity to obtain domain-specific parameters. Experimental results on the multi-domain UM-Corpus English-to-Chinese and OPUS German-to-English datasets show that the average BLEU scores of the proposed method exceed the strong baseline by 1.22 and 1.87 points, respectively. In addition, we present a case study to illustrate the effectiveness of the proposed method in acquiring domain knowledge.
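A minimal sketch of the parameter-differentiation idea, under the common assumption that the differentiation criterion compares per-domain gradient directions; the function names and the cosine-conflict criterion below are illustrative, not necessarily the paper's exact formulation. Parameters start shared across all domains; a parameter group whose domain gradients point in conflicting directions is split into domain-specific copies (here the differentiation granularity is one named parameter group).

```python
# Hedged sketch of parameter differentiation for multi-domain NMT.
# Criterion: split a shared parameter group when any two domains' gradients
# for it have cosine similarity below a threshold (i.e. they conflict).

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 1.0

def differentiate(shared_params, domain_grads, threshold=0.0):
    """Return (still-shared, domain-specific) parameter groups."""
    specific, still_shared = {}, {}
    for name, value in shared_params.items():
        grads = [g[name] for g in domain_grads.values()]
        sims = [cosine(grads[i], grads[j])
                for i in range(len(grads)) for j in range(i + 1, len(grads))]
        if sims and min(sims) < threshold:
            # Conflicting update directions: give each domain its own copy.
            specific[name] = {d: list(value) for d in domain_grads}
        else:
            still_shared[name] = value
    return still_shared, specific

shared = {"ffn": [1.0, 1.0], "attn": [0.5, 0.5]}
grads = {"news": {"ffn": [1.0, 0.0], "attn": [1.0, 1.0]},
         "laws": {"ffn": [-1.0, 0.0], "attn": [1.0, 0.9]}}
still_shared, specific = differentiate(shared, grads)
# "ffn" gradients oppose each other, so it is differentiated; "attn" stays shared.
```

The appeal of this family of methods is that only the parameters that actually conflict across domains become domain-specific, keeping the rest shared for cross-domain transfer.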
2021
基于模型不确定性约束的半监督汉缅神经机器翻译(Semi-Supervised Chinese-Myanmar Neural Machine Translation based Model-Uncertainty)
Linqin Wang (王琳钦)
|
Zhengtao Yu (余正涛)
|
Cunli Mao (毛存礼)
|
Chengxiang Gao (高盛祥)
|
Zhibo Man (满志博)
|
Zhenhan Wang (王振晗)
Proceedings of the 20th Chinese National Conference on Computational Linguistics
Back-translation-based semi-supervised neural machine translation has achieved clear gains in low-resource NMT. However, because Chinese-Myanmar bilingual resources are scarce and the two languages differ substantially in structure, the self-attention mechanism in the encoder of conventional Transformer-based back-translation cannot effectively distinguish how the noise in back-translated pseudo-parallel data affects sentence encoding, leading to omissions, over-translation, and mistranslation. To address this, this paper proposes a semi-supervised Chinese-Myanmar NMT method constrained by model uncertainty: within the Transformer network, Monte Carlo Dropout based on variational inference is used to build a model-uncertainty attention mechanism that yields sentence representations capable of distinguishing noisy data; these are then fused with the sentence encodings produced by the self-attention mechanism to obtain effective sentence representations. Experiments show that, compared with the conventional Transformer-based back-translation method, the proposed method improves BLEU by 4.01 and 1.88 points on the Chinese-to-Myanmar and Myanmar-to-Chinese directions, respectively, fully validating its effectiveness for Chinese-Myanmar neural machine translation.
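The uncertainty mechanism rests on Monte Carlo Dropout: keep dropout active at inference, run several stochastic forward passes, and read the variance of the outputs as model uncertainty, which can then down-weight likely-noisy pseudo-parallel examples. The toy sketch below uses a scalar scorer in place of a Transformer; the function names and the variance-to-weight mapping are illustrative assumptions, not the paper's exact mechanism.

```python
# Hedged sketch of Monte Carlo Dropout uncertainty estimation.
import random

def mc_dropout_uncertainty(score_fn, x, n_samples=200, p=0.5, seed=0):
    """Run score_fn n_samples times with dropout left ON and return
    (mean, variance); the variance is the model-uncertainty estimate."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # Inverted dropout mask: zero inputs with prob p, rescale survivors.
        mask = [0.0 if rng.random() < p else 1.0 / (1.0 - p) for _ in x]
        samples.append(score_fn([xi * mi for xi, mi in zip(x, mask)]))
    mean = sum(samples) / n_samples
    var = sum((s - mean) ** 2 for s in samples) / n_samples
    return mean, var

def uncertainty_weight(var, temperature=1.0):
    """Map variance to a weight in (0, 1]: high uncertainty (likely noisy
    back-translated data) receives a small attention/loss weight."""
    return 1.0 / (1.0 + temperature * var)

mean, var = mc_dropout_uncertainty(sum, [1.0, 1.0, 1.0, 1.0])
weight = uncertainty_weight(var)  # fused with self-attention scores in the paper
```

In the paper's setting, such a weight modulates how much a pseudo-parallel sentence's representation contributes alongside the standard self-attention encoding; here it is simply exposed as a scalar.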
2020
基于多语言联合训练的汉-英-缅神经机器翻译方法(Chinese-English-Burmese Neural Machine Translation Method Based on Multilingual Joint Training)
Zhibo Man (满志博)
|
Cunli Mao (毛存礼)
|
Zhengtao Yu (余正涛)
|
Xunyu Li (李训宇)
|
Shengxiang Gao (高盛祥)
|
Junguo Zhu (朱俊国)
Proceedings of the 19th Chinese National Conference on Computational Linguistics
Multilingual neural machine translation is an effective approach to low-resource NMT. Existing methods typically rely on a shared vocabulary to handle multilingual translation among similar languages such as English, French, and German. Myanmar is a typical low-resource language, and Chinese, English, and Myanmar differ substantially in linguistic structure. To alleviate the limit these structural differences place on the shared vocabulary, we propose a Chinese-English-Myanmar NMT method based on multilingual joint training. Under the Transformer framework, abundant Chinese-English parallel data are jointly trained with Chinese-Myanmar and English-Myanmar data. During training, Chinese, English, and Myanmar are mapped into the same semantic space at both the encoder and the decoder to reduce the impact of structural differences on the shared vocabulary, and parameters trained on the shared Chinese-English data compensate for the scarcity of Chinese-Myanmar data. Experiments show that in one-to-many and many-to-many translation scenarios, the proposed method clearly improves BLEU for Chinese-to-English, English-to-Myanmar, and Chinese-to-Myanmar over baseline models.
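The joint-training setup can be sketched as a single vocabulary shared across all three languages plus one mixed training stream drawn from every parallel corpus. The target-language tag used below is a common one-to-many NMT trick and an assumption here, not necessarily the paper's exact mechanism for mapping the languages into one semantic space; all names are illustrative.

```python
# Hedged sketch of multilingual joint training data preparation.
import random

def build_shared_vocab(corpora):
    """One vocabulary over all pairs, so Chinese, English, and Myanmar
    tokens share a single embedding space."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for examples in corpora.values():
        for src, tgt in examples:
            for tok in src.split() + tgt.split():
                vocab.setdefault(tok, len(vocab))
    return vocab

def joint_batches(corpora, rng):
    """Mix examples from every language pair into one shuffled training
    stream, prefixing each source with a target-language tag."""
    pool = [(f"<2{pair.split('-')[1]}> {src}", tgt)
            for pair, examples in corpora.items() for src, tgt in examples]
    rng.shuffle(pool)
    return pool

corpora = {
    "zh-en": [("wo ai ni", "i love you")],      # rich pair
    "zh-my": [("wo ai ni", "nga chit tay")],    # scarce pair
    "en-my": [("i love you", "nga chit tay")],  # scarce pair
}
vocab = build_shared_vocab(corpora)
pool = joint_batches(corpora, random.Random(0))
```

Training one model on such a mixed stream is what lets parameters learned from the rich Chinese-English data compensate for the scarce Chinese-Myanmar and English-Myanmar data.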