2024
pdf
bib
abs
Enhancing Gender-Inclusive Machine Translation with Neomorphemes and Large Language Models
Andrea Piergentili
|
Beatrice Savoldi
|
Matteo Negri
|
Luisa Bentivogli
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1)
Machine translation (MT) models are known to suffer from gender bias, especially when translating into languages with extensive gendered morphology. Accordingly, they still fall short in using gender-inclusive language, also representative of non-binary identities. In this paper, we look at gender-inclusive neomorphemes, neologistic elements that avoid binary gender markings as an approach towards fairer MT. In this direction, we explore prompting techniques with large language models (LLMs) to translate from English into Italian using neomorphemes. So far, this area has been under-explored due to its novelty and the lack of publicly available evaluation resources. We fill this gap by releasing NEO-GATE, a resource designed to evaluate gender-inclusive en→it translation with neomorphemes. With NEO-GATE, we assess four LLMs of different families and sizes and different prompt formats, identifying strengths and weaknesses of each on this novel task for MT.
pdf
bib
abs
Automatic Subtitling and Subtitle Compression: FBK at the IWSLT 2024 Subtitling track
Marco Gaido
|
Sara Papi
|
Mauro Cettolo
|
Roldano Cattoni
|
Andrea Piergentili
|
Matteo Negri
|
Luisa Bentivogli
Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024)
The paper describes the FBK submissions to the Subtitling track of the 2024 IWSLT Evaluation Campaign, which covers both the Automatic Subtitling and the Subtitle Compression task for two language pairs: English to German (en-de) and English to Spanish (en-es). For the Automatic Subtitling task, we submitted two systems: i) a direct model, trained in constrained conditions, that produces the SRT files from the audio without intermediate outputs (e.g., transcripts), and ii) a cascade solution that integrates only free-to-use components, either taken off-the-shelf or developed in-house. Results show that, on both language pairs, our direct model outperforms both cascade and direct systems trained in constrained conditions in last year’s edition of the campaign, while our cascade solution is competitive with the best 2023 runs. For the Subtitle Compression task, our primary submission involved prompting a Large Language Model (LLM) in zero-shot mode to shorten subtitles that exceed the reading speed limit of 21 characters per second. Our results highlight the challenges inherent in shrinking out-of-context sentence fragments that are automatically generated and potentially error-prone, underscoring the need for future studies to develop targeted solutions.
pdf
bib
abs
A Prompt Response to the Demand for Automatic Gender-Neutral Translation
Beatrice Savoldi
|
Andrea Piergentili
|
Dennis Fucci
|
Matteo Negri
|
Luisa Bentivogli
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers)
Gender-neutral translation (GNT) that avoids biased and undue binary assumptions is a pivotal challenge for the creation of more inclusive translation technologies. Advancements for this task in Machine Translation (MT), however, are hindered by the lack of dedicated parallel data, which are necessary to adapt MT systems to satisfy neutral constraints. For such a scenario, large language models offer hitherto unforeseen possibilities, as they come with the distinct advantage of being versatile in various (sub)tasks when provided with explicit instructions. In this paper, we explore this potential to automate GNT by comparing MT with the popular GPT-4 model. Through extensive manual analyses, our study empirically reveals the inherent limitations of current MT systems in generating GNTs and provides valuable insights into the potential and challenges associated with prompting for neutrality.
2023
pdf
bib
abs
Hi Guys or Hi Folks? Benchmarking Gender-Neutral Machine Translation with the GeNTE Corpus
Andrea Piergentili
|
Beatrice Savoldi
|
Dennis Fucci
|
Matteo Negri
|
Luisa Bentivogli
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Gender inequality is embedded in our communication practices and perpetuated in translation technologies. This becomes particularly apparent when translating into grammatical gender languages, where machine translation (MT) often defaults to masculine and stereotypical representations by making undue binary gender assumptions. Our work addresses the rising demand for inclusive language by focusing head-on on gender-neutral translation from English to Italian. We start from the essentials: proposing a dedicated benchmark and exploring automated evaluation methods. First, we introduce GeNTE, a natural, bilingual test set for gender-neutral translation, whose creation was informed by a survey on the perception and use of neutral language. Based on GeNTE, we then overview existing reference-based evaluation approaches, highlight their limits, and propose a reference-free method more suitable to assess gender-neutral translation.
pdf
bib
abs
Gender Neutralization for an Inclusive Machine Translation: from Theoretical Foundations to Open Challenges
Andrea Piergentili
|
Dennis Fucci
|
Beatrice Savoldi
|
Luisa Bentivogli
|
Matteo Negri
Proceedings of the First Workshop on Gender-Inclusive Translation Technologies
Gender inclusivity in language technologies has become a prominent research topic. In this study, we explore gender-neutral translation (GNT) as a form of gender inclusivity and a goal to be achieved by machine translation (MT) models, which have been found to perpetuate gender bias and discrimination. Specifically, we focus on translation from English into Italian, a language pair representative of salient gender-related linguistic transfer problems. To define GNT, we review a selection of relevant institutional guidelines for gender-inclusive language, discuss its scenarios of use, and examine the technical challenges of performing GNT in MT, concluding with a discussion of potential solutions to encourage advancements toward greater inclusivity in MT.