Workshop on Creative-text Translation and Technology (2024)



Proceedings of the 1st Workshop on Creative-text Translation and Technology

Proceedings of the 1st Workshop on Creative-text Translation and Technology
Bram Vanroy | Marie-Aude Lefer | Lieve Macken | Paola Ruffo

Using a multilingual literary parallel corpus to train NMT systems
Bojana Mikelenić | Antoni Oliver

This article presents an application of a multilingual and multidirectional parallel corpus composed of literary texts in five Romance languages (Spanish, French, Italian, Portuguese, Romanian) and a Slavic language (Croatian), with a total of 142,000 segments and 15.7 million words. After combining it with very large freely available parallel corpora, this resource is used to train NMT systems tailored to literature. A total of five NMT systems have been trained: Spanish-French, Spanish-Italian, Spanish-Portuguese, Spanish-Romanian and Spanish-Croatian. The trained systems were evaluated using automatic metrics (BLEU, chrF2 and TER), and a comparison with a rule-based MT system (Apertium) and a neural system (Google Translate) is presented. As a main conclusion, we can highlight that the use of this literary corpus has been very productive: most of the trained systems achieve automatic quality scores comparable to, and in some cases better than, those of a widely used commercial NMT system.
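
As an illustration of the character-based chrF2 metric mentioned above, here is a minimal pure-Python sketch. The evaluation itself presumably used standard tooling (e.g. sacrebleu); this simplified version just averages character n-gram precision and recall for n = 1-6 and combines them with an F-score weighting recall twice as much (beta = 2), omitting word n-grams and corpus-level aggregation:

```python
from collections import Counter

def char_ngrams(text, n):
    # Character n-grams, with spaces removed as in chrF.
    text = text.replace(" ", "")
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def chrf2(hypothesis, reference, max_n=6, beta=2.0):
    # Average char n-gram precision/recall over orders 1..max_n,
    # then combine with F-beta (beta=2 weights recall twice as much).
    precs, recs = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if sum(hyp.values()) == 0 or sum(ref.values()) == 0:
            continue
        overlap = sum((hyp & ref).values())
        precs.append(overlap / sum(hyp.values()))
        recs.append(overlap / sum(ref.values()))
    if not precs:
        return 0.0
    p, r = sum(precs) / len(precs), sum(recs) / len(recs)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)

print(round(chrf2("the cat sat on the mat", "the cat sat on the mat"), 2))  # → 1.0
```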

‘Can make mistakes’. Prompting ChatGPT to Enhance Literary MT output
Gys-Walt Egdom | Christophe Declercq | Onno Kosters

Operating at the intersection of generative AI (artificial intelligence), machine translation (MT), and literary translation, this paper examines to what extent prompt-driven post-editing (PE) can enhance the quality of machine-translated literary texts. We assess how different types of instruction influence PE performance, particularly focusing on literary nuances and author-specific styles. Situated within posthumanist translation theory, which often challenges traditional notions of human intervention in translation processes, the study explores the practical implementation of generative AI in multilingual workflows. While the findings suggest that prompted PE can improve translation output to some extent, its effectiveness varies, especially in literary contexts. This highlights the need for a critical review of prompt engineering approaches and emphasizes the importance of further research to navigate the complexities of integrating AI into creative translation workflows effectively.

LitPC: A set of tools for building parallel corpora from literary works
Antoni Oliver | Sergi Alvarez-Vidal

In this paper, we describe the LitPC toolkit, a variety of tools and methods designed for the quick and effective creation of parallel corpora derived from literary works. This toolkit can be a useful resource due to the scarcity of curated parallel texts for this domain. We also feature a case study describing the creation of a Russian-English parallel corpus based on literary works by Leo Tolstoy. Furthermore, an augmented version of this corpus is used to both train and assess neural machine translation systems specifically adapted to the author’s style.

Prompting Large Language Models for Idiomatic Translation
Antonio Castaldo | Johanna Monti

Large Language Models (LLMs) have demonstrated impressive performance in translating content across different languages and genres. Yet, their potential in the creative aspects of machine translation has not been fully explored. In this paper, we seek to identify the strengths and weaknesses inherent in different LLMs when applied to one of the most prominent features of creative works: the translation of idiomatic expressions. We present an overview of their performance in the English-Italian (EN-IT) language pair, a context characterized by an evident lack of bilingual data tailored for idiomatic translation. Lastly, we investigate the impact of prompt design on the quality of machine translation, drawing on recent findings which indicate a substantial variation in the performance of LLMs depending on the prompts utilized.
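
A minimal sketch of the kind of prompt design such a study investigates. The instruction wording, the example idiom pair, and the function name below are illustrative assumptions, not the authors' actual prompts:

```python
def idiom_prompt(sentence, src="English", tgt="Italian", examples=None):
    """Build a few-shot prompt nudging an LLM toward idiomatic
    (non-literal) translation. Wording and examples are illustrative."""
    lines = [
        f"Translate the {src} sentence into {tgt}.",
        "If it contains an idiom, use an equivalent idiom in the target "
        "language rather than a word-for-word rendering.",
    ]
    # Optional few-shot demonstrations (source, target) pairs.
    for src_ex, tgt_ex in (examples or []):
        lines.append(f"{src}: {src_ex}\n{tgt}: {tgt_ex}")
    lines.append(f"{src}: {sentence}\n{tgt}:")
    return "\n\n".join(lines)

prompt = idiom_prompt("It’s raining cats and dogs.",
                      examples=[("Break a leg!", "In bocca al lupo!")])
print(prompt)
```

Varying the instruction line (zero-shot vs. few-shot, explicit idiom warning vs. plain "translate") is one simple way to probe the prompt sensitivity the abstract describes.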

An Analysis of Surprisal Uniformity in Machine and Human Translations
Josef Jon | Ondřej Bojar

This study examines neural machine translation (NMT) and its performance on texts that diverge from typical standards, focusing on how information is organized within sentences. We analyze surprisal distributions in source texts, human translations, and machine translations across several datasets to determine whether NMT systems naturally promote a uniform density of surprisal in their translations, even when the original texts do not adhere to this principle. The findings reveal that NMT tends to align more closely with source texts in terms of surprisal uniformity than human translations do. We also analyzed absolute values of the surprisal uniformity measures, expecting human translations to be less uniform. Contrary to our initial hypothesis, we did not find comprehensive evidence for this claim, although some results suggest it might hold for very diverse texts, such as poetry.
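
As a toy sketch of how surprisal (non-)uniformity can be quantified: the variance and local-difference measures below are common operationalizations in the uniform-information-density literature, not necessarily the exact measures of this paper, and the input numbers are illustrative rather than real language-model surprisals:

```python
import statistics

def uid_scores(surprisals):
    # Two common operationalizations of surprisal (non-)uniformity:
    # variance of per-token surprisal, and the mean squared difference
    # between adjacent tokens (local inhomogeneity). Lower = more uniform.
    var = statistics.pvariance(surprisals)
    local = sum((b - a) ** 2 for a, b in zip(surprisals, surprisals[1:]))
    local /= len(surprisals) - 1
    return var, local

# A perfectly uniform surprisal stream vs. a bursty one (toy numbers).
print(uid_scores([2.0, 2.0, 2.0, 2.0]))  # → (0.0, 0.0)
print(uid_scores([0.5, 6.0, 0.5, 6.0]))
```

In practice the per-token surprisals would come from a language model as negative log-probabilities, computed separately for source, human translation, and machine translation.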

Impact of translation workflows with and without MT on textual characteristics in literary translation
Joke Daems | Paola Ruffo | Lieve Macken

The use of machine translation is increasingly being explored for the translation of literary texts, but there is still considerable uncertainty about the optimal translation workflow in these scenarios. While overall quality is quite good, certain textual characteristics can differ between a human-translated text and a text produced by means of machine translation post-editing, which has been shown to potentially affect reader perceptions and experience as well. In this study, we look at textual characteristics of short story translations from B.J. Novak’s One More Thing into Dutch. Twenty-three professional literary translators translated three short stories in three different conditions: using Word, using the classic CAT tool Trados, and using a machine translation post-editing platform specifically designed for literary translation. We look at overall text characteristics (sentence length, type-token ratio, stylistic differences) to establish whether translation workflow has an impact on these features, and whether the three workflows lead to markedly different final translations.
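
The overall text characteristics mentioned (sentence length, type-token ratio) can be sketched with a rough profiling function; the regex-based sentence and token splitting below is a simplification of the proper tokenization a real study would use:

```python
import re

def text_profile(text):
    # Rough sentence/token splits; real studies would use a tokenizer.
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s]
    tokens = re.findall(r"\w+", text.lower())
    return {
        # Mean number of tokens per sentence.
        "avg_sentence_len": len(tokens) / len(sentences),
        # Lexical diversity: distinct tokens over total tokens.
        "type_token_ratio": len(set(tokens)) / len(tokens),
    }

profile = text_profile("The cat sat. The cat slept. A dog barked loudly!")
print(profile)
```

Comparing such profiles across the Word, Trados, and PE-platform conditions is one direct way to test whether workflow shapes the final text.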

Machine Translation Meets Large Language Models: Evaluating ChatGPT’s Ability to Automatically Post-Edit Literary Texts
Lieve Macken

Large language models such as GPT-4 have been trained on vast corpora, giving them excellent language understanding. This study explores the use of ChatGPT for post-editing machine translations of literary texts. Three short stories, machine translated from English into Dutch, were post-edited by 7-8 professional translators and ChatGPT. Automatic metrics were used to evaluate the number and type of edits made, and semantic and syntactic similarity between the machine translation and the corresponding post-edited versions. A manual analysis classified errors in the machine translation and changes made by the post-editors. The results show that ChatGPT made more changes than the average post-editor. ChatGPT improved lexical richness over machine translation for all texts. The analysis of editing types showed that ChatGPT replaced more words with synonyms, corrected fewer machine errors and introduced more problems than professionals.
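
A minimal sketch of how the number and type of edits between a machine translation and a post-edited version can be counted at token level; difflib opcodes are a rough stand-in for the automatic edit metrics the study used, and the sentences are illustrative:

```python
import difflib

def count_edits(mt_tokens, pe_tokens):
    # Token-level edit counts via difflib opcodes: a rough proxy for
    # the replacement/insertion/deletion statistics in PE analysis.
    ops = {"replace": 0, "insert": 0, "delete": 0, "equal": 0}
    sm = difflib.SequenceMatcher(a=mt_tokens, b=pe_tokens)
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        ops[tag] += max(i2 - i1, j2 - j1)
    return ops

mt = "the man walked fast to the house".split()
pe = "the man strode quickly to the house".split()
print(count_edits(mt, pe))  # → {'replace': 2, 'insert': 0, 'delete': 0, 'equal': 5}
```

Aggregating such counts per post-editor (human or ChatGPT) gives the kind of edit-volume comparison reported in the abstract.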