Ana Guerberof-Arenas

Also published as: Ana Guerberof Arenas

2025

To MT or not to MT: An eye-tracking study on the reception by Dutch readers of different translation and creativity levels
Kyo Gerrits | Ana Guerberof Arenas
Proceedings of Machine Translation Summit XX: Volume 1

This article presents the results of a pilot study involving the reception of a fictional short story translated from English into Dutch under four conditions: machine translation (MT), post-editing (PE), human translation (HT) and original source text (ST). The aim is to understand how creativity and errors in different translation modalities affect readers, specifically regarding cognitive load. Eight participants filled in a questionnaire, read a story using an eye-tracker, and conducted a retrospective think-aloud (RTA) interview. The results show that units of creative potential (UCP) increase cognitive load and that this is the highest in HT and the lowest in MT; no effect of error was observed. Triangulating the data with RTAs leads us to hypothesize that the higher cognitive load in UCPs is linked to increases in reader enjoyment and immersion. The effect of translation creativity on cognitive load in different translation modalities at word-level is novel and opens up new avenues for further research.


Proceedings of the Second Workshop on Creative-text Translation and Technology (CTT)
Bram Vanroy | Marie-Aude Lefer | Lieve Macken | Paola Ruffo | Ana Guerberof Arenas | Damien Hansen
Proceedings of the Second Workshop on Creative-text Translation and Technology (CTT)


Optimising ChatGPT for creativity in literary translation: A case study from English into Dutch, Chinese, Catalan and Spanish
Shuxiang Du | Ana Guerberof Arenas | Antonio Toral | Kyo Gerrits | Josep Marco Borillo
Proceedings of Machine Translation Summit XX: Volume 1

This study examines the variability of ChatGPT’s machine translation (MT) outputs across six different configurations in four languages, with a focus on creativity in a literary text. We evaluate GPT translations at different text granularity levels, temperature settings and prompting strategies with a Creativity Score formula. We found that prompting ChatGPT with a minimal instruction yields the best creative translations, with “Translate the following text into [TG] creatively” at the temperature of 1.0 outperforming other configurations and DeepL in Spanish, Dutch, and Chinese. Nonetheless, ChatGPT consistently underperforms compared to human translation (HT). The repository URL for the code and data will be provided with the camera-ready version.


Word-level quality estimation (QE) methods aim to detect erroneous spans in machine translations, which can direct and facilitate human post-editing. While the accuracy of word-level QE systems has been assessed extensively, their usability and downstream influence on the speed, quality, and editing choices of human post-editing remain understudied. In this study, we investigate the impact of word-level QE on machine translation (MT) post-editing in a realistic setting involving 42 professional post-editors across two translation directions. We compare four error-span highlight modalities, including supervised and uncertainty-based word-level QE methods, for identifying potential errors in the outputs of a state-of-the-art neural MT model. Post-editing effort and productivity are estimated from behavioral logs, while quality improvements are assessed by word- and segment-level human annotation. We find that domain, language and editors’ speed are critical factors in determining highlights’ effectiveness, with modest differences between human-made and automated QE highlights underlining a gap between accuracy and usability in professional workflows.

2024


What the Harm? Quantifying the Tangible Impact of Gender Bias in Machine Translation with a Human-centered Study
Beatrice Savoldi | Sara Papi | Matteo Negri | Ana Guerberof-Arenas | Luisa Bentivogli
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Gender bias in machine translation (MT) is recognized as an issue that can harm people and society. And yet, advancements in the field rarely involve people, the final MT users, or inform how they might be impacted by biased technologies. Current evaluations are often restricted to automatic methods, which offer an opaque estimate of what the downstream impact of gender disparities might be. We conduct an extensive human-centered study to examine if and to what extent bias in MT brings harms with tangible costs, such as quality of service gaps across women and men. To this aim, we collect behavioral data from ~90 participants, who post-edited MT outputs to ensure correct gender translation. Across multiple datasets, languages, and types of users, our study shows that feminine post-editing demands significantly more technical and temporal effort, also corresponding to higher financial costs. Existing bias measurements, however, fail to reflect the found disparities. Our findings advocate for human-centered approaches that can inform the societal impact of bias.


INCREC: Uncovering the creative process of translated content using machine translation
Ana Guerberof-Arenas
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 2)

The INCREC project aims to uncover professional translators’ creative stages to understand how technology can be best applied to the translation of literary and audio-visual texts, and to analyse the impact of these processes on readers and viewers. To better understand this process, INCREC triangulates data from eye-tracking, retrospective think-aloud interviews, translated material, and questionnaires from professional translators and users.

LT-LiDER is an Erasmus+ cooperation project with two main aims. The first is to map the landscape of technological capabilities required to work as a language and/or translation expert in the digitalised and datafied language industry. The second is to generate training outputs that will help language and translation trainers improve their skills and adopt appropriate pedagogical approaches and strategies for integrating data-driven technology into their language or translation classrooms, with a focus on digital and AI literacy.

2023


Migrant communities living in the Netherlands and their use of MT in healthcare settings
Susana Valdez | Ana Guerberof Arenas | Kars Ligtenberg
Proceedings of the 24th Annual Conference of the European Association for Machine Translation

As part of a larger project on the use of MT in healthcare settings among migrant communities, this paper investigates if, when, how and with what (potential) challenges migrants use MT based on a survey of 201 non-native speakers of Dutch currently living in the Netherlands. Three main findings stand out from our analysis. First, most migrants use MT to understand health information in Dutch and communicate with health professionals. How MT is used and received varies depending on the context and the L2 language level, as well as age, but not on the educational level. Second, some users face challenges of different kinds, including a lack of trust or perceived inaccuracies. Some of these challenges are related to comprehension, which brings us to our third point. We argue that a more nuanced understanding of medical translation is needed in expert-to-non-expert health communication. This questionnaire helped us identify several topics we hope to explore in the project’s next phase.

2022


CREAMT: Creativity and narrative engagement of literary texts translated by translators and NMT
Ana Guerberof Arenas | Antonio Toral
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

We present here the EU-funded project CREAMT that seeks to understand what is meant by creativity in different translation modalities, e.g. machine translation, post-editing or professional translation. Focusing on the textual elements that determine creativity in translated literary texts and the reader experience, CREAMT uses a novel, interdisciplinary approach to assess how effective MT is in literary translation considering creativity in translation and the ultimate user: the reader.


DivEMT: Neural Machine Translation Post-Editing Effort Across Typologically Diverse Languages
Gabriele Sarti | Arianna Bisazza | Ana Guerberof-Arenas | Antonio Toral
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

We introduce DivEMT, the first publicly available post-editing study of Neural Machine Translation (NMT) over a typologically diverse set of target languages. Using a strictly controlled setup, 18 professional translators were instructed to translate or post-edit the same set of English documents into Arabic, Dutch, Italian, Turkish, Ukrainian, and Vietnamese. During the process, their edits, keystrokes, editing times and pauses were recorded, enabling an in-depth, cross-lingual evaluation of NMT quality and post-editing effectiveness. Using this new dataset, we assess the impact of two state-of-the-art NMT systems, Google Translate and the multilingual mBART-50 model, on translation productivity. We find that post-editing is consistently faster than translation from scratch. However, the magnitude of productivity gains varies widely across systems and languages, highlighting major disparities in post-editing effectiveness for languages at different degrees of typological relatedness to English, even when controlling for system architecture and training data size. We publicly release the complete dataset including all collected behavioral data, to foster new research on the translation capabilities of NMT systems for typologically diverse languages.

2019


What is the impact of raw MT on Japanese users of Word: preliminary results of a usability study using eye-tracking
Ana Guerberof Arenas | Joss Moorkens | Sharon O’Brien
Proceedings of Machine Translation Summit XVII: Research Track

2018


Reading Comprehension of Machine Translation Output: What Makes for a Better Read?
Sheila Castilho | Ana Guerberof Arenas
Proceedings of the 21st Annual Conference of the European Association for Machine Translation

This paper reports on a pilot experiment that compares two different machine translation (MT) paradigms in reading comprehension tests. To explore a suitable methodology, we set up a pilot experiment with a group of six users (native speakers of English, Spanish and Simplified Chinese) using the International English Language Testing System (IELTS) and an eye-tracker. The users were asked to read three texts in their native language: either the original English text (for the English speakers) or the machine-translated text (for the Spanish and Simplified Chinese speakers). The original texts were machine-translated via two MT systems: neural (NMT) and statistical (SMT). The users were also asked to rank satisfaction statements on a 3-point scale after reading each text and answering the respective comprehension questions. After all tasks were completed, a post-task retrospective interview took place to gather qualitative data. The findings suggest that the users from the target languages completed more tasks in less time with a higher level of satisfaction when using translations from the NMT system.
