融合确定性因子及区域密度的k-最近邻机器翻译方法(A k-Nearest-Neighbor Machine Translation Method Combining Certainty Factor and Region Density)

Qi Rui (齐睿), Shi Xiangyu (石响宇), Man Zhibo (满志博), Xu Jinan (徐金安), Chen Yufeng (陈钰枫)


Abstract
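“k-nearest-neighbor machine translation (kNN-MT) has become an important research direction in neural machine translation in recent years. Such methods can improve translation quality without updating the machine translation model, but the imbalance between high-frequency and low-frequency words in the training data limits their effectiveness, and a fixed value of k cannot produce good translations for data lying in regions of different density. To address this, this paper proposes a novel kNN-MT method that introduces a certainty factor (CF) to reduce the impact of data imbalance on model performance, and that dynamically selects k according to the data density around the test point. On a multi-domain German-English translation dataset, the method improves translation quality over the baseline in all four domains, with gains of more than 1 BLEU in three of them, effectively improving the translation quality of neural machine translation models.”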
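To make the density-dependent choice of k concrete, the following is a minimal sketch and not the authors' implementation. It assumes the commonly used kNN-MT formulation in which a softmax over negative neighbor distances is interpolated with the NMT distribution. The density rule (counting retrieved neighbors within a fixed `radius`), the function names, and all parameter values are hypothetical. The abstract does not define the certainty factor (CF), so it is left out.

```python
import math

def choose_k(distances, radius, k_min=1, k_max=8):
    """Hypothetical density rule: k = number of retrieved neighbors within `radius`."""
    dense = sum(1 for d in distances if d <= radius)
    return max(k_min, min(k_max, dense))

def knn_distribution(neighbors, k, temperature=10.0):
    """Softmax over negative distances of the k nearest (distance, token) pairs,
    with probability mass aggregated per target token."""
    nearest = sorted(neighbors, key=lambda pair: pair[0])[:k]
    weights = [(tok, math.exp(-d / temperature)) for d, tok in nearest]
    total = sum(w for _, w in weights)
    probs = {}
    for tok, w in weights:
        probs[tok] = probs.get(tok, 0.0) + w / total
    return probs

def interpolate(p_nmt, p_knn, lam=0.5):
    """Interpolation: lam * p_kNN + (1 - lam) * p_NMT."""
    tokens = set(p_nmt) | set(p_knn)
    return {t: lam * p_knn.get(t, 0.0) + (1 - lam) * p_nmt.get(t, 0.0) for t in tokens}

# Toy datastore lookup: (distance, target token) pairs for one decoding step.
neighbors = [(1.0, "house"), (2.0, "house"), (9.0, "home"), (30.0, "building")]
k = choose_k([d for d, _ in neighbors], radius=10.0)  # three neighbors lie within the radius
p_knn = knn_distribution(neighbors, k)
p = interpolate({"house": 0.6, "home": 0.4}, p_knn, lam=0.5)
```

In this toy example the distant neighbor "building" falls outside the radius, so it is excluded from the retrieval distribution. With a fixed k of 4 it would have contributed probability mass.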
Anthology ID:
2024.ccl-1.16
Volume:
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
Month:
July
Year:
2024
Address:
Taiyuan, China
Editors:
Maosong Sun, Jiye Liang, Xianpei Han, Zhiyuan Liu, Yulan He
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
217–229
Language:
Chinese
URL:
https://aclanthology.org/2024.ccl-1.16/
Cite (ACL):
Qi Rui, Shi Xiangyu, Man Zhibo, Xu Jinan, and Chen Yufeng. 2024. 融合确定性因子及区域密度的k-最近邻机器翻译方法(A k-Nearest-Neighbor Machine Translation Method Combining Certainty Factor and Region Density). In Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference), pages 217–229, Taiyuan, China. Chinese Information Processing Society of China.
Cite (Informal):
融合确定性因子及区域密度的k-最近邻机器翻译方法(A k-Nearest-Neighbor Machine Translation Method Combining Certainty Factor and Region Density) (Rui et al., CCL 2024)
PDF:
https://aclanthology.org/2024.ccl-1.16.pdf