Enhancing Neural Machine Translation Through Target Language Data: A kNN-LM Approach for Domain Adaptation

Abudurexiti Reheman; Hongyu Liu; Junhao Ruan; Abudukeyumu Abudula; Yingfeng Luo; Tong Xiao (肖桐); Jingbo Zhu

doi:10.18653/v1/2025.acl-long.496

Abstract

Neural machine translation (NMT) has advanced significantly, yet challenges remain in adapting to new domains. This issue is further exacerbated in scenarios where bilingual data is limited. To address this, we propose kNN-LM-NMT, a method that leverages semantically similar target language sentences in the kNN framework. Our approach generates a probability distribution over these sentences during decoding, and this distribution is then interpolated with the NMT model’s distribution. Additionally, we introduce an n-gram-based approach to focus on similar fragments, enabling the model to avoid the noise introduced by the non-similar parts. To enhance accuracy, we further incorporate cross-lingual retrieval similarity to refine the kNN probability distribution. Extensive experiments on multi-domain datasets demonstrate significant performance improvements in both high-resource and low-resource scenarios. Our approach effectively extracts translation knowledge from limited target-domain data and benefits from large-scale monolingual data for robust context representation.
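
The interpolation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the code below assumes the standard kNN-LM formulation: retrieved neighbors are weighted by a softmax over negative distances, weights are summed per target token, and the result is mixed linearly with the NMT distribution. The function names, the `temperature` and `lam` parameters, and the toy inputs are all hypothetical and are not taken from the paper; the n-gram filtering and cross-lingual similarity refinement are not modeled here.

```python
import math
from collections import defaultdict


def knn_distribution(neighbors, temperature=1.0):
    """Turn retrieved (distance, token) pairs into a token distribution.

    Assumption: weights follow softmax(-distance / temperature), as in
    the standard kNN-LM setup; the paper's exact scoring may differ.
    """
    if not neighbors:
        return {}
    scores = [-dist / temperature for dist, _ in neighbors]
    max_score = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - max_score) for s in scores]
    total = sum(weights)
    probs = defaultdict(float)
    for weight, (_, token) in zip(weights, neighbors):
        probs[token] += weight / total  # neighbors sharing a token add up
    return dict(probs)


def interpolate(p_nmt, p_knn, lam=0.5):
    """Linear mix: lam * p_kNN + (1 - lam) * p_NMT over the joint vocabulary."""
    vocab = set(p_nmt) | set(p_knn)
    return {
        tok: lam * p_knn.get(tok, 0.0) + (1.0 - lam) * p_nmt.get(tok, 0.0)
        for tok in vocab
    }


if __name__ == "__main__":
    # Toy example: the NMT model prefers "bank", the retrieved neighbors prefer "shore".
    p_nmt = {"bank": 0.7, "shore": 0.2, "river": 0.1}
    neighbors = [(0.5, "shore"), (0.7, "shore"), (2.0, "bank")]
    p_knn = knn_distribution(neighbors, temperature=1.0)
    print(interpolate(p_nmt, p_knn, lam=0.4))
```

In this sketch a larger `lam` shifts more probability mass toward the retrieved target-language evidence, which is the lever a domain-adaptation setup of this kind would tune on held-out data.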

Anthology ID: 2025.acl-long.496
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 10053–10065
URL: https://aclanthology.org/2025.acl-long.496/
DOI: 10.18653/v1/2025.acl-long.496
Cite (ACL): Abudurexiti Reheman, Hongyu Liu, Junhao Ruan, Abudukeyumu Abudula, Yingfeng Luo, Tong Xiao, and JingBo Zhu. 2025. Enhancing Neural Machine Translation Through Target Language Data: A kNN-LM Approach for Domain Adaptation. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 10053–10065, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Enhancing Neural Machine Translation Through Target Language Data: A kNN-LM Approach for Domain Adaptation (Reheman et al., ACL 2025)
PDF: https://aclanthology.org/2025.acl-long.496.pdf
