2025
Mind the Inclusivity Gap: Multilingual Gender-Neutral Translation Evaluation with mGeNTE
Beatrice Savoldi | Giuseppe Attanasio | Eleonora Cupin | Eleni Gkovedarou | Janiça Hackenbuchner | Anne Lauscher | Matteo Negri | Andrea Piergentili | Manjinder Thind | Luisa Bentivogli
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Avoiding the propagation of undue (binary) gender inferences and default masculine language remains a key challenge towards inclusive multilingual technologies, particularly when translating into languages with extensive gendered morphology. Gender-neutral translation (GNT) represents a linguistic strategy towards fairer communication across languages. However, research on GNT is limited to a few resources and language pairs. To address this gap, we introduce mGeNTE, an expert-curated resource, and use it to conduct the first systematic multilingual evaluation of inclusive translation with state-of-the-art instruction-following language models (LMs). Experiments on en-es/de/it/el reveal that while models can recognize when neutrality is appropriate, they cannot consistently produce neutral translations, limiting their usability. To probe this behavior, we enrich our evaluation with interpretability analyses that identify task-relevant features and offer initial insights into the internal dynamics of LM-based GNT.
Glitter: A Multi-Sentence, Multi-Reference Benchmark for Gender-Fair German Machine Translation
A Pranav | Janiça Hackenbuchner | Giuseppe Attanasio | Manuel Lardelli | Anne Lauscher
Findings of the Association for Computational Linguistics: EMNLP 2025
Machine translation (MT) research addressing gender inclusivity has gained attention for promoting non-exclusionary language representing all genders. However, existing resources are limited in size, most often consisting of single sentences, or single gender-fair formulation types, leaving questions about MT models’ ability to use context and diverse inclusive forms. We introduce Glitter, an English-German benchmark featuring extended passages with professional translations implementing three gender-fair alternatives: neutral rewording, typographical solutions (gender star), and neologistic forms (-ens forms). Our experiments reveal significant limitations in state-of-the-art language models, which default to masculine generics, struggle to interpret explicit gender cues in context, and rarely produce gender-fair translations. Through a systematic prompting analysis designed to elicit fair language, we demonstrate that these limitations stem from models’ fundamental misunderstanding of gender phenomena, as they fail to implement inclusive forms even when explicitly instructed. Glitter establishes a challenging benchmark, advancing research in gender-fair English-German MT. It highlights substantial room for improvement among leading models and can guide the development of future MT models capable of accurately representing gender diversity.
GENDEROUS: Machine Translation and Cross-Linguistic Evaluation of a Gender-Ambiguous Dataset
Janiça Hackenbuchner | Eleni Gkovedarou | Joke Daems
Proceedings of the 6th Workshop on Gender Bias in Natural Language Processing (GeBNLP)
Contributing to research on gender beyond the binary, this work introduces GENDEROUS, a dataset of gender-ambiguous sentences containing gender-marked occupations and adjectives, and sentences with the ambiguous or non-binary pronoun their. We cross-linguistically evaluate how machine translation (MT) systems and large language models (LLMs) translate these sentences from English into four grammatical gender languages: Greek, German, Spanish and Dutch. We show the systems’ continued default to male-gendered translations, with exceptions (particularly for Dutch). Prompting for alternatives, however, shows potential in attaining more diverse and neutral translations across all languages. An LLM-as-a-judge approach was implemented, where benchmarking against gold standards emphasises the continued need for human annotations.
Proceedings of the 3rd Workshop on Gender-Inclusive Translation Technologies (GITT 2025)
Janiça Hackenbuchner | Luisa Bentivogli | Joke Daems | Chiara Manna | Beatrice Savoldi | Eva Vanmassenhove
2024
Automatic detection of (potential) factors in the source text leading to gender bias in machine translation
Janiça Hackenbuchner | Arda Tezcan | Joke Daems
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 2)
This research project aims to develop a comprehensive methodology to help make machine translation (MT) systems more gender-inclusive for society. The goal is the creation of a detection system, a machine learning (ML) model trained on manual annotations, that can automatically analyse source data and detect and highlight words and phrases that influence the gender bias inflection in target translations. The main research outputs will be (1) a manually annotated dataset, (2) a taxonomy, and (3) a fine-tuned model.
Literacy in Digital Environments and Resources (LT-LiDER)
Joss Moorkens | Pilar Sánchez-Gijón | Esther Simon | Mireia Urpí | Nora Aranberri | Dragoș Ciobanu | Ana Guerberof-Arenas | Janiça Hackenbuchner | Dorothy Kenny | Ralph Krüger | Miguel Rios | Isabel Ginel | Caroline Rossi | Alina Secară | Antonio Toral
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 2)
LT-LiDER is an Erasmus+ cooperation project with two main aims. The first is to map the landscape of technological capabilities required to work as a language and/or translation expert in the digitalised and datafied language industry. The second is to generate training outputs that will help language and translation trainers improve their skills and adopt appropriate pedagogical approaches and strategies for integrating data-driven technology into their language or translation classrooms, with a focus on digital and AI literacy.
Proceedings of the 2nd International Workshop on Gender-Inclusive Translation Technologies
Beatrice Savoldi | Janiça Hackenbuchner | Luisa Bentivogli | Joke Daems | Eva Vanmassenhove | Jasmijn Bastings
You Shall Know a Word’s Gender by the Company it Keeps: Comparing the Role of Context in Human Gender Assumptions with MT
Janiça Hackenbuchner | Joke Daems | Arda Tezcan | Aaron Maladry
Proceedings of the 2nd International Workshop on Gender-Inclusive Translation Technologies
In this paper, we analyse to what extent machine translation (MT) systems and humans base their gender translations and associations on role names and on stereotypicality in the absence of (generic) grammatical gender cues in language. We compare an MT system’s choice of gender for a certain word when translating from a notional gender language, English, into a grammatical gender language, German, with the gender associations of humans. We outline a comparative case study of gender translation and annotation of words in isolation, out-of-context, and words in sentence contexts. The analysis reveals patterns of gender (bias) by MT and gender associations by humans for certain (1) out-of-context words and (2) words in-context. Our findings reveal the impact of context on gender choice and translation and show that word-level analyses fall short in such studies.
2023
DataLitMT – Teaching Data Literacy in the Context of Machine Translation Literacy
Janiça Hackenbuchner | Ralph Krüger
Proceedings of the 24th Annual Conference of the European Association for Machine Translation
This paper presents the DataLitMT project conducted at TH Köln – University of Applied Sciences. The project develops learning resources for teaching data literacy in its translation-specific form of professional machine translation (MT) literacy to students of translation and specialised communication programmes at BA and MA levels. We discuss the need for data literacy teaching in a translation/specialised communication context, present the three theoretical pillars of the project (consisting of a Professional MT Literacy Framework, an MT-specific data literacy framework and a competence matrix derived from these frameworks) and give an overview of the learning resources developed as part of the project.
Proceedings of the First Workshop on Gender-Inclusive Translation Technologies
Eva Vanmassenhove | Beatrice Savoldi | Luisa Bentivogli | Joke Daems | Janiça Hackenbuchner
2022
DeBiasByUs: Raising Awareness and Creating a Database of MT Bias
Joke Daems | Janiça Hackenbuchner
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
This paper presents the project initiated by the BiasByUs team resulting from the 2021 Artificially Correct Hackathon. We briefly explain our winning participation in the hackathon, which tackled the challenge on ‘Database and detection of gender bias in A.I. translations’, highlight the importance of gender bias in Machine Translation (MT), and describe our proposed solution to the challenge, the current status of the project, and our envisioned future collaborations and research.