2024
What the Harm? Quantifying the Tangible Impact of Gender Bias in Machine Translation with a Human-centered Study
Beatrice Savoldi | Sara Papi | Matteo Negri | Ana Guerberof-Arenas | Luisa Bentivogli
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Gender bias in machine translation (MT) is recognized as an issue that can harm people and society. And yet, advancements in the field rarely involve people, the final MT users, or inform how they might be impacted by biased technologies. Current evaluations are often restricted to automatic methods, which offer an opaque estimate of what the downstream impact of gender disparities might be. We conduct an extensive human-centered study to examine if and to what extent bias in MT brings harms with tangible costs, such as quality of service gaps across women and men. To this aim, we collect behavioral data from ~90 participants, who post-edited MT outputs to ensure correct gender translation. Across multiple datasets, languages, and types of users, our study shows that feminine post-editing demands significantly more technical and temporal effort, also corresponding to higher financial costs. Existing bias measurements, however, fail to reflect the found disparities. Our findings advocate for human-centered approaches that can inform the societal impact of bias.
INCREC: Uncovering the creative process of translated content using machine translation
Ana Guerberof-Arenas
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 2)
The INCREC project aims to uncover professional translators’ creative stages to understand how technology can be best applied to the translation of literary and audio-visual texts, and to analyse the impact of these processes on readers and viewers. To better understand this process, INCREC triangulates data from eye-tracking, retrospective think-aloud interviews, translated material, and questionnaires from professional translators and users.
Literacy in Digital Environments and Resources (LT-LiDER)
Joss Moorkens | Pilar Sánchez-Gijón | Esther Simon | Mireia Urpí | Nora Aranberri | Dragoș Ciobanu | Ana Guerberof-Arenas | Janiça Hackenbuchner | Dorothy Kenny | Ralph Krüger | Miguel Rios | Isabel Ginel | Caroline Rossi | Alina Secară | Antonio Toral
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 2)
LT-LiDER is an Erasmus+ cooperation project with two main aims. The first is to map the landscape of technological capabilities required to work as a language and/or translation expert in the digitalised and datafied language industry. The second is to generate training outputs that will help language and translation trainers improve their skills and adopt appropriate pedagogical approaches and strategies for integrating data-driven technology into their language or translation classrooms, with a focus on digital and AI literacy.
2023
Migrant communities living in the Netherlands and their use of MT in healthcare settings
Susana Valdez | Ana Guerberof Arenas | Kars Ligtenberg
Proceedings of the 24th Annual Conference of the European Association for Machine Translation
As part of a larger project on the use of MT in healthcare settings among migrant communities, this paper investigates if, when, how and with what (potential) challenges migrants use MT based on a survey of 201 non-native speakers of Dutch currently living in the Netherlands. Three main findings stand out from our analysis. First, most migrants use MT to understand health information in Dutch and communicate with health professionals. How MT is used and received varies depending on the context and the L2 language level, as well as age, but not on the educational level. Second, some users face challenges of different kinds, including a lack of trust or perceived inaccuracies. Some of these challenges are related to comprehension, which brings us to our third point. We argue that a more nuanced understanding of medical translation is needed in expert-to-non-expert health communication. This questionnaire helped us identify several topics we hope to explore in the project’s next phase.
2022
DivEMT: Neural Machine Translation Post-Editing Effort Across Typologically Diverse Languages
Gabriele Sarti | Arianna Bisazza | Ana Guerberof-Arenas | Antonio Toral
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
We introduce DivEMT, the first publicly available post-editing study of Neural Machine Translation (NMT) over a typologically diverse set of target languages. Using a strictly controlled setup, 18 professional translators were instructed to translate or post-edit the same set of English documents into Arabic, Dutch, Italian, Turkish, Ukrainian, and Vietnamese. During the process, their edits, keystrokes, editing times and pauses were recorded, enabling an in-depth, cross-lingual evaluation of NMT quality and post-editing effectiveness. Using this new dataset, we assess the impact of two state-of-the-art NMT systems, Google Translate and the multilingual mBART-50 model, on translation productivity. We find that post-editing is consistently faster than translation from scratch. However, the magnitude of productivity gains varies widely across systems and languages, highlighting major disparities in post-editing effectiveness for languages at different degrees of typological relatedness to English, even when controlling for system architecture and training data size. We publicly release the complete dataset including all collected behavioral data, to foster new research on the translation capabilities of NMT systems for typologically diverse languages.
CREAMT: Creativity and narrative engagement of literary texts translated by translators and NMT
Ana Guerberof Arenas | Antonio Toral
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
We present here the EU-funded project CREAMT that seeks to understand what is meant by creativity in different translation modalities, e.g. machine translation, post-editing or professional translation. Focusing on the textual elements that determine creativity in translated literary texts and the reader experience, CREAMT uses a novel, interdisciplinary approach to assess how effective MT is in literary translation considering creativity in translation and the ultimate user: the reader.
2019
What is the impact of raw MT on Japanese users of Word: preliminary results of a usability study using eye-tracking
Ana Guerberof Arenas | Joss Moorkens | Sharon O’Brien
Proceedings of Machine Translation Summit XVII: Research Track
2018
Reading Comprehension of Machine Translation Output: What Makes for a Better Read?
Sheila Castilho | Ana Guerberof Arenas
Proceedings of the 21st Annual Conference of the European Association for Machine Translation
This paper reports on a pilot experiment that compares two different machine translation (MT) paradigms in reading comprehension tests. To explore a suitable methodology, we set up a pilot experiment with a group of six users (native speakers of English, Spanish and Simplified Chinese) using the International English Language Testing System (IELTS) and an eye-tracker. The users were asked to read three texts in their native language: either the original English text (for the English speakers) or the machine-translated text (for the Spanish and Simplified Chinese speakers). The original texts were machine-translated via two MT systems: neural (NMT) and statistical (SMT). The users were also asked to rank satisfaction statements on a 3-point scale after reading each text and answering the respective comprehension questions. After all tasks were completed, a post-task retrospective interview took place to gather qualitative data. The findings suggest that the users from the target languages completed more tasks in less time with a higher level of satisfaction when using translations from the NMT system.