Towards Demonstration-Aware Large Language Models for Machine Translation

Chen Li, Meishan Zhang, Xuebo Liu, Zhaocong Li, Derek Wong, Min Zhang


Abstract
Tuning-based large language models for machine translation (a.k.a. large translation models, LTMs) have demonstrated strong performance in machine translation. Despite their success, these models often face difficulties in leveraging demonstrations to further improve their performance. To tackle this challenge, we introduce a novel approach that integrates demonstration-aware training and inference strategies within the framework of tuning-based LTMs, hereafter referred to as demonstration-aware LTMs. During training, we enrich the model’s learning process by incorporating both sentence- and document-level demonstrations derived from its original training dataset. During inference, the model synergizes its own contextual translations with retrieved high-quality demonstrations, leading to more precise and contextually appropriate outputs. Empirical results reveal that our demonstration-aware LTM not only mitigates the negative impacts traditionally associated with demonstrations but also secures substantial improvements in translation accuracy, particularly in domain-specific and document-level translation tasks. Source code and scripts are freely available at https://github.com/ChenLi0620/Demo-Aware-LLM-MT.
Anthology ID:
2024.findings-acl.824
Volume:
Findings of the Association for Computational Linguistics ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand and virtual meeting
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
13868–13881
URL:
https://aclanthology.org/2024.findings-acl.824
Cite (ACL):
Chen Li, Meishan Zhang, Xuebo Liu, Zhaocong Li, Derek Wong, and Min Zhang. 2024. Towards Demonstration-Aware Large Language Models for Machine Translation. In Findings of the Association for Computational Linguistics ACL 2024, pages 13868–13881, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
Towards Demonstration-Aware Large Language Models for Machine Translation (Li et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.824.pdf