Can Large Language Models Learn Translation Robustness from Noisy-Source In-context Demonstrations?

Leiyu Pan, Yongqi Leng, Deyi Xiong


Abstract
Large language models (LLMs) have been used for machine translation. When provided with prompts and source sentences, LLMs can achieve impressive translation results. However, the robustness of these LLMs remains a significant challenge: they often struggle to translate sentences accurately in the presence of noise, even when similarity-based in-context learning methods are used. This work proposes a research scheme for studying the machine translation robustness of LLMs, investigating whether LLMs can learn translation robustness from noisy-source demonstration examples. Through experiments across different models, languages, and noise types, we empirically demonstrate that LLMs can learn both noise handling and translation strategies from noisy-source demonstration examples, thereby improving their translation performance on noisy sentences. Furthermore, we find that appropriately increasing the noise ratio of the noisy-source demonstration examples can enhance the translation robustness of LLMs. Finally, we investigate the scenarios in which LLMs are more likely to learn translation robustness, for both mixed and specific noise types, and find that model performance varies across different noise settings.
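To make the setup concrete, below is a minimal sketch of noisy-source in-context prompting as the abstract describes it, assuming character-level substitution noise and a simple German-to-English few-shot template. The function names (inject_char_noise, build_prompt), the noise_ratio parameter, and the prompt format are illustrative assumptions, not the paper's exact implementation.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def inject_char_noise(sentence: str, noise_ratio: float, seed: int = 0) -> str:
    """Perturb roughly `noise_ratio` of the characters by random substitution.

    Hypothetical noise model: the paper studies several noise types; this
    sketch implements only character substitution for illustration.
    """
    rng = random.Random(seed)
    chars = list(sentence)
    n_noisy = int(len(chars) * noise_ratio)
    for idx in rng.sample(range(len(chars)), k=min(n_noisy, len(chars))):
        if chars[idx].isalpha():
            chars[idx] = rng.choice(ALPHABET)
    return "".join(chars)

def build_prompt(demos: list[tuple[str, str]], test_src: str,
                 noise_ratio: float) -> str:
    """Assemble a few-shot translation prompt whose demonstration sources
    carry the same kind of noise as the (noisy) test input."""
    lines = []
    for i, (src, tgt) in enumerate(demos):
        noisy_src = inject_char_noise(src, noise_ratio, seed=i)
        lines.append(f"German: {noisy_src}\nEnglish: {tgt}")
    lines.append(f"German: {test_src}\nEnglish:")
    return "\n\n".join(lines)

if __name__ == "__main__":
    demos = [
        ("Das Wetter ist heute schön.", "The weather is nice today."),
        ("Ich lese gern Bücher.", "I like reading books."),
    ]
    noisy_test = inject_char_noise("Wo ist der Bahnhof?", 0.2, seed=42)
    print(build_prompt(demos, noisy_test, noise_ratio=0.2))
```

Under this sketch, raising noise_ratio in the demonstrations, while keeping the references clean, is the knob the abstract refers to when it says increasing the noise ratio of the demonstrations can enhance robustness.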
Anthology ID:
2024.lrec-main.249
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
2798–2808
URL:
https://aclanthology.org/2024.lrec-main.249
Cite (ACL):
Leiyu Pan, Yongqi Leng, and Deyi Xiong. 2024. Can Large Language Models Learn Translation Robustness from Noisy-Source In-context Demonstrations?. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 2798–2808, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Can Large Language Models Learn Translation Robustness from Noisy-Source In-context Demonstrations? (Pan et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.249.pdf