Analysing Translation Artifacts: A Comparative Study of LLMs, NMTs, and Human Translations

Fedor Sizov; Cristina España-Bonet; Josef van Genabith; Roy Xie; Koel Dutta Chowdhury

doi:10.18653/v1/2024.wmt-1.116

Abstract

Translated texts exhibit a range of characteristics that make them appear distinct from texts originally written in the same target language. With the rise of Large Language Models (LLMs), which are designed for a wide range of language generation and understanding tasks, there has been significant interest in their application to Machine Translation. While several studies have focused on improving translation quality through fine-tuning or few-shot prompting techniques, there has been limited exploration of how LLM-generated translations qualitatively differ from human translations and from those produced by Neural Machine Translation (NMT) models. Our study employs explainability methods such as Leave-One-Out (LOO) and Integrated Gradients (IG) to analyze the lexical features distinguishing human translations from those produced by LLMs and NMT systems. Specifically, we apply a two-stage approach: first, classifying texts based on their origin (original or translation), and second, extracting significant lexical features (highly attributed input words) using post-hoc interpretability methods. Our analysis shows that different methods of feature extraction vary in their effectiveness, with LOO being generally better at pinpointing critical input words and IG capturing a broader range of important words. Finally, our results show that while LLMs and NMT systems can produce translations of good quality, their output still differs from texts originally written by native speakers. Specifically, we find that while some LLMs often align closely with human translations, traditional NMT systems exhibit distinct characteristics, particularly in their use of certain linguistic features.
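
As a rough illustration of the Leave-One-Out idea mentioned in the abstract, the sketch below scores each input word by how much a classifier's output drops when that word is removed. It is a minimal sketch, not the authors' implementation: the function names, the toy scorer, and its word weights are all hypothetical, and in practice the scorer would be a trained original-vs-translation classifier.

```python
from typing import Callable, List, Tuple

def loo_attributions(
    tokens: List[str],
    score_fn: Callable[[List[str]], float],
) -> List[Tuple[str, float]]:
    """Leave-One-Out: attribute to each token the drop in score
    observed when that single token is deleted from the input."""
    base = score_fn(tokens)
    attributions = []
    for i, tok in enumerate(tokens):
        reduced = tokens[:i] + tokens[i + 1:]  # input without token i
        attributions.append((tok, base - score_fn(reduced)))
    return attributions

# Hypothetical stand-in for a classifier's "translation" score.
# The weights are invented for illustration only.
TOY_WEIGHTS = {"thereof": 3.0, "indeed": 2.0}

def toy_score(tokens: List[str]) -> float:
    return sum(TOY_WEIGHTS.get(t.lower(), 0.0) for t in tokens)

example = ["The", "results", "thereof", "are", "indeed", "clear"]
ranked = sorted(loo_attributions(example, toy_score),
                key=lambda pair: pair[1], reverse=True)
print(ranked[:2])  # [('thereof', 3.0), ('indeed', 2.0)]
```

With a real classifier, the highest-ranked words would be the candidate lexical features; each input of n words costs n + 1 forward passes under this scheme.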

Anthology ID: 2024.wmt-1.116
Volume: Proceedings of the Ninth Conference on Machine Translation
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
Venues: WMT | WS
Publisher: Association for Computational Linguistics
Pages: 1183–1199
URL: https://aclanthology.org/2024.wmt-1.116/
DOI: 10.18653/v1/2024.wmt-1.116
Cite (ACL): Fedor Sizov, Cristina España-Bonet, Josef Van Genabith, Roy Xie, and Koel Dutta Chowdhury. 2024. Analysing Translation Artifacts: A Comparative Study of LLMs, NMTs, and Human Translations. In Proceedings of the Ninth Conference on Machine Translation, pages 1183–1199, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Analysing Translation Artifacts: A Comparative Study of LLMs, NMTs, and Human Translations (Sizov et al., WMT 2024)
PDF: https://aclanthology.org/2024.wmt-1.116.pdf