Mitigating Translationese with GPT-4: Strategies and Performance

Maria Kunilovskaya, Koel Dutta Chowdhury, Heike Przybyl, Cristina España-Bonet, Josef Genabith


Abstract
Translations differ in systematic ways from texts originally authored in the same language. These differences, collectively known as translationese, can pose challenges in cross-lingual natural language processing: models trained or tested on translated input might struggle when presented with non-translated language. Translationese mitigation can alleviate this problem. This study investigates the generative capacities of GPT-4 to reduce translationese in human-translated texts. The task is framed as a rewriting process aimed at producing modified translations that are indistinguishable from original text in the target language. Our focus is on prompt engineering that tests the utility of linguistic knowledge as part of the instruction for GPT-4. Through a series of prompt design experiments, we show that GPT-4-generated revisions are more similar to originals in the target language when the prompts incorporate specific linguistic instructions instead of relying solely on the model’s internal knowledge. Furthermore, we release the segment-aligned bidirectional German-English data built from the Europarl corpus that underpins this study.
Anthology ID:
2024.eamt-1.35
Volume:
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1)
Month:
June
Year:
2024
Address:
Sheffield, UK
Editors:
Carolina Scarton, Charlotte Prescott, Chris Bayliss, Chris Oakley, Joanna Wright, Stuart Wrigley, Xingyi Song, Edward Gow-Smith, Rachel Bawden, Víctor M Sánchez-Cartagena, Patrick Cadwell, Ekaterina Lapshinova-Koltunski, Vera Cabarrão, Konstantinos Chatzitheodorou, Mary Nurminen, Diptesh Kanojia, Helena Moniz
Venue:
EAMT
Publisher:
European Association for Machine Translation (EAMT)
Pages:
411–430
URL:
https://aclanthology.org/2024.eamt-1.35
Cite (ACL):
Maria Kunilovskaya, Koel Dutta Chowdhury, Heike Przybyl, Cristina España-Bonet, and Josef Genabith. 2024. Mitigating Translationese with GPT-4: Strategies and Performance. In Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1), pages 411–430, Sheffield, UK. European Association for Machine Translation (EAMT).
Cite (Informal):
Mitigating Translationese with GPT-4: Strategies and Performance (Kunilovskaya et al., EAMT 2024)
PDF:
https://aclanthology.org/2024.eamt-1.35.pdf