Beatrice Savoldi | Janiça Hackenbuchner | Luisa Bentivogli | Joke Daems | Eva Vanmassenhove | Jasmijn Bastings

Enhancing Gender-Inclusive Machine Translation with Neomorphemes and Large Language Models
Andrea Piergentili | Beatrice Savoldi | Matteo Negri | Luisa Bentivogli
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1)

Machine translation (MT) models are known to suffer from gender bias, especially when translating into languages with extensive gendered morphology. Accordingly, they still fall short in using gender-inclusive language, also representative of non-binary identities. In this paper, we look at gender-inclusive neomorphemes, neologistic elements that avoid binary gender markings, as an approach towards fairer MT. In this direction, we explore prompting techniques with large language models (LLMs) to translate from English into Italian using neomorphemes. So far, this area has been under-explored due to its novelty and the lack of publicly available evaluation resources. We fill this gap by releasing NEO-GATE, a resource designed to evaluate gender-inclusive en→it translation with neomorphemes. With NEO-GATE, we assess four LLMs of different families and sizes and different prompt formats, identifying strengths and weaknesses of each on this novel task for MT.

FBK@IWSLT Test Suites Task: Gender Bias evaluation with MuST-SHE
Beatrice Savoldi | Marco Gaido | Matteo Negri | Luisa Bentivogli
Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024)

This paper presents the FBK contribution to the IWSLT-2024 ‘Test suites’ shared subtask, part of the Offline Speech Translation Task. Our contribution consists of the MuST-SHE-IWSLT24 benchmark evaluation, designed to assess gender bias in speech translation. By focusing on the en-de language pair, we rely on a newly created test suite to investigate systems’ ability to correctly translate feminine and masculine gender. Our results indicate that – under realistic conditions – current ST systems achieve reasonable and comparable performance in correctly translating both feminine and masculine forms when contextual gender information is available. For ambiguous references to the speaker, however, we attest a consistent preference towards masculine gender, thus calling for future endeavours on the topic. Towards this goal we make MuST-SHE-IWSLT24 freely available at:

A Prompt Response to the Demand for Automatic Gender-Neutral Translation
Beatrice Savoldi | Andrea Piergentili | Dennis Fucci | Matteo Negri | Luisa Bentivogli
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers)

Gender-neutral translation (GNT) that avoids biased and undue binary assumptions is a pivotal challenge for the creation of more inclusive translation technologies. Advancements for this task in Machine Translation (MT), however, are hindered by the lack of dedicated parallel data, which are necessary to adapt MT systems to satisfy neutral constraints. For such a scenario, large language models offer hitherto unforeseen possibilities, as they come with the distinct advantage of being versatile in various (sub)tasks when provided with explicit instructions. In this paper, we explore this potential to automate GNT by comparing MT with the popular GPT-4 model. Through extensive manual analyses, our study empirically reveals the inherent limitations of current MT systems in generating GNTs and provides valuable insights into the potential and challenges associated with prompting for neutrality.


Hi Guys or Hi Folks? Benchmarking Gender-Neutral Machine Translation with the GeNTE Corpus
Andrea Piergentili | Beatrice Savoldi | Dennis Fucci | Matteo Negri | Luisa Bentivogli
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Gender inequality is embedded in our communication practices and perpetuated in translation technologies. This becomes particularly apparent when translating into grammatical gender languages, where machine translation (MT) often defaults to masculine and stereotypical representations by making undue binary gender assumptions. Our work addresses the rising demand for inclusive language by focusing head-on on gender-neutral translation from English to Italian. We start from the essentials: proposing a dedicated benchmark and exploring automated evaluation methods. First, we introduce GeNTE, a natural, bilingual test set for gender-neutral translation, whose creation was informed by a survey on the perception and use of neutral language. Based on GeNTE, we then overview existing reference-based evaluation approaches, highlight their limits, and propose a reference-free method more suitable to assess gender-neutral translation.

Test Suites Task: Evaluation of Gender Fairness in MT with MuST-SHE and INES
Beatrice Savoldi | Marco Gaido | Matteo Negri | Luisa Bentivogli
Proceedings of the Eighth Conference on Machine Translation

As part of the WMT-2023 “Test suites” shared task, in this paper we summarize the results of two test suite evaluations: MuST-SHEWMT23 and INES. By focusing on the en-de and de-en language pairs, we rely on these newly created test suites to investigate systems’ ability to translate feminine and masculine gender and produce gender-inclusive translations. Furthermore, we discuss metrics associated with our test suites and validate them by means of human evaluations. Our results indicate that systems achieve reasonable and comparable performance in correctly translating both feminine and masculine gender forms for naturalistic gender phenomena. Instead, the generation of inclusive language forms in translation emerges as a challenging task for all the evaluated MT models, indicating room for future improvements and research on the topic. We make MuST-SHEWMT23 and INES freely available.

Proceedings of the First Workshop on Gender-Inclusive Translation Technologies
Eva Vanmassenhove | Beatrice Savoldi | Luisa Bentivogli | Joke Daems | Janiça Hackenbuchner

Gender Neutralization for an Inclusive Machine Translation: from Theoretical Foundations to Open Challenges
Andrea Piergentili | Dennis Fucci | Beatrice Savoldi | Luisa Bentivogli | Matteo Negri
Proceedings of the First Workshop on Gender-Inclusive Translation Technologies

Gender inclusivity in language technologies has become a prominent research topic. In this study, we explore gender-neutral translation (GNT) as a form of gender inclusivity and a goal to be achieved by machine translation (MT) models, which have been found to perpetuate gender bias and discrimination. Specifically, we focus on translation from English into Italian, a language pair representative of salient gender-related linguistic transfer problems. To define GNT, we review a selection of relevant institutional guidelines for gender-inclusive language, discuss its scenarios of use, and examine the technical challenges of performing GNT in MT, concluding with a discussion of potential solutions to encourage advancements toward greater inclusivity in MT.


On the Dynamics of Gender Learning in Speech Translation
Beatrice Savoldi | Marco Gaido | Luisa Bentivogli | Matteo Negri | Marco Turchi
Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP)

Due to the complexity of bias and the opaque nature of current neural approaches, there is a rising interest in auditing language technologies. In this work, we contribute to such a line of inquiry by exploring the emergence of gender bias in Speech Translation (ST). As a new perspective, rather than focusing on the final systems only, we examine their evolution over the course of training. In this way, we are able to account for different variables related to the learning dynamics of gender translation, and investigate when and how gender divides emerge in ST. Accordingly, for three language pairs (en → es, fr, it) we compare how ST systems behave for masculine and feminine translation at several levels of granularity. We find that masculine and feminine curves are dissimilar, with the feminine one being characterized by more erratic behaviour and late improvements over the course of training. Also, depending on the considered phenomena, their learning trends can be either antiphase or parallel. Overall, we show how such a progressive analysis can inform on the reliability and time-wise acquisition of gender, which is concealed by static evaluations and standard metrics.

Under the Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias in Speech Translation
Beatrice Savoldi | Marco Gaido | Luisa Bentivogli | Matteo Negri | Marco Turchi
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Gender bias is largely recognized as a problematic phenomenon affecting language technologies, with recent studies underscoring that it might surface differently across languages. However, most current evaluation practices adopt a word-level focus on a narrow set of occupational nouns under synthetic conditions. Such protocols overlook key features of grammatical gender languages, which are characterized by morphosyntactic chains of gender agreement, marked on a variety of lexical items and parts-of-speech (POS). To overcome this limitation, we enrich the natural, gender-sensitive MuST-SHE corpus (Bentivogli et al., 2020) with two new linguistic annotation layers (POS and agreement chains), and explore to what extent different lexical categories and agreement phenomena are impacted by gender skews. Focusing on speech translation, we conduct a multifaceted evaluation on three language directions (English-French/Italian/Spanish), with models trained on varying amounts of data and different word segmentation techniques. By shedding light on model behaviours, gender bias, and its detection at several levels of granularity, our findings emphasize the value of dedicated analyses beyond aggregated overall results.


Gender Bias in Machine Translation
Beatrice Savoldi | Marco Gaido | Luisa Bentivogli | Matteo Negri | Marco Turchi
Transactions of the Association for Computational Linguistics, Volume 9

Machine translation (MT) technology has facilitated our daily tasks by providing accessible shortcuts for gathering, processing, and communicating information. However, it can suffer from biases that harm users and society at large. As a relatively new field of inquiry, studies of gender bias in MT still lack cohesion. This advocates for a unified framework to ease future research. To this end, we: i) critically review current conceptualizations of bias in light of theoretical insights from related disciplines, ii) summarize previous analyses aimed at assessing gender bias in MT, iii) discuss the mitigating strategies proposed so far, and iv) point toward potential directions for future work.

How to Split: the Effect of Word Segmentation on Gender Bias in Speech Translation
Marco Gaido | Beatrice Savoldi | Luisa Bentivogli | Matteo Negri | Marco Turchi
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021


Breeding Gender-aware Direct Speech Translation Systems
Marco Gaido | Beatrice Savoldi | Luisa Bentivogli | Matteo Negri | Marco Turchi
Proceedings of the 28th International Conference on Computational Linguistics

In automatic speech translation (ST), traditional cascade approaches involving separate transcription and translation steps are giving ground to increasingly competitive and more robust direct solutions. In particular, by translating speech audio data without intermediate transcription, direct ST models are able to leverage and preserve essential information present in the input (e.g., speaker’s vocal characteristics) that is otherwise lost in the cascade framework. Although such ability proved to be useful for gender translation, direct ST is nonetheless affected by gender bias just like its cascade counterpart, as well as machine translation and numerous other natural language processing applications. Moreover, direct ST systems that exclusively rely on vocal biometric features as a gender cue can be unsuitable or even potentially problematic for certain users. Going beyond speech signals, in this paper we compare different approaches to inform direct ST models about the speaker’s gender and test their ability to handle gender translation from English into Italian and French. To this aim, we manually annotated large datasets with speakers’ gender information and used them for experiments reflecting different possible real-world scenarios. Our results show that gender-aware direct ST solutions can significantly outperform strong – but gender-unaware – direct ST models. In particular, the translation of gender-marked words can increase up to 30 points in accuracy while preserving overall translation quality.

Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus
Luisa Bentivogli | Beatrice Savoldi | Matteo Negri | Mattia A. Di Gangi | Roldano Cattoni | Marco Turchi
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Translating from languages without productive grammatical gender like English into gender-marked languages is a well-known difficulty for machines. This difficulty is also due to the fact that the training data on which models are built typically reflect the asymmetries of natural languages, gender bias included. Exclusively fed with textual data, machine translation is intrinsically constrained by the fact that the input sentence does not always contain clues about the gender identity of the referred human entities. But what happens with speech translation, where the input is an audio signal? Can audio provide additional information to reduce gender bias? We present the first thorough investigation of gender bias in speech translation, contributing with: i) the release of a benchmark useful for future studies, and ii) the comparison of different technologies (cascade and end-to-end) on two language directions (English-Italian/French).