Incorporating translation quality estimation into Chinese-Korean neural machine translation

Li Feiyu, Zhao Yahui, Yang Feiyang, Cui Rongyi


Abstract
Exposure bias and poor translation diversity are two common problems in neural machine translation (NMT), both caused by the general use of the teacher forcing strategy for training NMT models. Moreover, NMT models usually require a large-scale, high-quality parallel corpus. However, Korean is a low-resource language, and there is no large-scale parallel corpus between Chinese and Korean, which is a challenge for researchers. Therefore, we propose a method that incorporates translation quality estimation into the translation process and adopts reinforcement learning. The evaluation mechanism is used to guide the training of the model so that the prediction does not converge completely to the ground-truth word. When the model predicts a sequence different from the ground truth, the evaluation mechanism can give an appropriate evaluation and reward to the model. In addition, we alleviate the lack of Korean corpus resources by adding training data: in our experiments, we introduce a monolingual corpus of a certain scale to construct pseudo-parallel data. We also preprocess the Korean corpus at different granularities to overcome data sparsity. Experimental results show that our method outperforms the baselines on Chinese-Korean and Korean-Chinese translation tasks, which demonstrates its effectiveness.
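The reward-guided training idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a generic REINFORCE-style objective, a quality estimation score already computed as a single number for a sampled translation, and a simple interpolation with the teacher forcing loss. The function names, the `baseline` term, and the `alpha` weight are all hypothetical.

```python
import math


def sequence_log_prob(token_probs):
    """Sum the log-probabilities the model assigned to each sampled token."""
    return sum(math.log(p) for p in token_probs)


def reinforce_loss(token_probs, reward, baseline=0.0):
    """REINFORCE-style loss: -(reward - baseline) * log p(sampled sequence).

    `reward` stands in for a quality estimation score (assumed here to be a
    plain float); `baseline` is a hypothetical variance-reduction term.
    """
    return -(reward - baseline) * sequence_log_prob(token_probs)


def mixed_loss(mle_loss, rl_loss, alpha=0.5):
    """Interpolate the teacher forcing (MLE) loss with the reward-driven loss.

    `alpha` is a hypothetical mixing weight, not a value from the paper.
    """
    return alpha * mle_loss + (1.0 - alpha) * rl_loss


if __name__ == "__main__":
    # A sampled two-token translation that differs from the reference but
    # still receives a positive quality score.
    rl = reinforce_loss([0.5, 0.5], reward=1.0)
    print(mixed_loss(mle_loss=2.0, rl_loss=rl, alpha=0.5))
```

Under this sketch, a sampled sequence that differs from the reference still contributes a useful training signal whenever its quality score exceeds the baseline, so the model is not pushed exclusively toward the ground-truth tokens.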
Anthology ID:
2021.ccl-1.81
Volume:
Proceedings of the 20th Chinese National Conference on Computational Linguistics
Month:
August
Year:
2021
Address:
Huhhot, China
Editors:
Sheng Li (李生), Maosong Sun (孙茂松), Yang Liu (刘洋), Hua Wu (吴华), Kang Liu (刘康), Wanxiang Che (车万翔), Shizhu He (何世柱), Gaoqi Rao (饶高琦)
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
906–915
Language:
English
URL:
https://aclanthology.org/2021.ccl-1.81
Cite (ACL):
Li Feiyu, Zhao Yahui, Yang Feiyang, and Cui Rongyi. 2021. Incorporating translation quality estimation into Chinese-Korean neural machine translation. In Proceedings of the 20th Chinese National Conference on Computational Linguistics, pages 906–915, Huhhot, China. Chinese Information Processing Society of China.
Cite (Informal):
Incorporating translation quality estimation into Chinese-Korean neural machine translation (Feiyu et al., CCL 2021)
PDF:
https://aclanthology.org/2021.ccl-1.81.pdf