Dennis Fucci

2025

Different Speech Translation Models Encode and Translate Speaker Gender Differently
Dennis Fucci | Marco Gaido | Matteo Negri | Luisa Bentivogli | Andre Martins | Giuseppe Attanasio
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Recent studies on interpreting the hidden states of speech models have shown their ability to capture speaker-specific features, including gender. Does this finding also hold for speech translation (ST) models? If so, what are the implications for the speaker’s gender assignment in translation? We address these questions from an interpretability perspective, using probing methods to assess gender encoding across diverse ST models. Results on three language directions (English → French/Italian/Spanish) indicate that while traditional encoder-decoder models capture gender information, newer architectures—integrating a speech encoder with a machine translation system via adapters—do not. We also demonstrate that low gender encoding capabilities result in systems’ tendency toward a masculine default, a translation bias that is more pronounced in newer architectures.

pdf bib abs

The Unheard Alternative: Contrastive Explanations for Speech-to-Text Models
Lina Conti | Dennis Fucci | Marco Gaido | Matteo Negri | Guillaume Wisniewski | Luisa Bentivogli
Proceedings of the 8th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP

Contrastive explanations, which indicate why an AI system produced one output (the target) instead of another (the foil), are widely recognized in explainable AI as more informative and interpretable than standard explanations. However, obtaining such explanations for speech-to-text (S2T) generative models remains an open challenge. Adopting a feature attribution framework, we propose the first method to obtain contrastive explanations in S2T by analyzing how specific regions of the input spectrogram influence the choice between alternative outputs. Through a case study on gender translation in speech translation, we show that our method accurately identifies the audio features that drive the selection of one gender over another.

2024

pdf bib abs

Twists, Humps, and Pebbles: Multilingual Speech Recognition Models Exhibit Gender Performance Gaps
Giuseppe Attanasio | Beatrice Savoldi | Dennis Fucci | Dirk Hovy
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Current automatic speech recognition (ASR) models are designed to be used across many languages and tasks without substantial changes. However, this broad language coverage hides performance gaps within languages, for example, across genders. Our study systematically evaluates the performance of two widely used multilingual ASR models on three datasets, encompassing 19 languages from eight language families and two speaking conditions. Our findings reveal clear gender disparities, with the advantaged group varying across languages and models. Surprisingly, those gaps are not explained by acoustic or lexical properties. However, probing internal model states reveals a correlation with gendered performance gap. That is, the easier it is to distinguish speaker gender in a language using probes, the more the gap reduces, favoring female speakers. Our results show that gender disparities persist even in state-of-the-art models. Our findings have implications for the improvement of multilingual ASR systems, underscoring the importance of accessibility to training data and nuanced evaluation to predict and mitigate gender gaps. We release all code and artifacts at https://github.com/g8a9/multilingual-asr-gender-gap.

pdf bib abs

A Prompt Response to the Demand for Automatic Gender-Neutral Translation
Beatrice Savoldi | Andrea Piergentili | Dennis Fucci | Matteo Negri | Luisa Bentivogli
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers)

Gender-neutral translation (GNT) that avoids biased and undue binary assumptions is a pivotal challenge for the creation of more inclusive translation technologies. Advancements for this task in Machine Translation (MT), however, are hindered by the lack of dedicated parallel data, which are necessary to adapt MT systems to satisfy neutral constraints. For such a scenario, large language models offer hitherto unforeseen possibilities, as they come with the distinct advantage of being versatile in various (sub)tasks when provided with explicit instructions. In this paper, we explore this potential to automate GNT by comparing MT with the popular GPT-4 model. Through extensive manual analyses, our study empirically reveals the inherent limitations of current MT systems in generating GNTs and provides valuable insights into the potential and challenges associated with prompting for neutrality.

pdf bib abs

Explainability for Speech Models: On the Challenges of Acoustic Feature Selection
Dennis Fucci | Beatrice Savoldi | Marco Gaido | Matteo Negri | Mauro Cettolo | Luisa Bentivogli
Proceedings of the Tenth Italian Conference on Computational Linguistics (CLiC-it 2024)

Spurred by the demand for transparency and interpretability in Artificial Intelligence (AI), the field of eXplainable AI (XAI) has experienced significant growth, marked by both theoretical reflections and technical advancements. While various XAI techniques, especially feature attribution methods, have been extensively explored across diverse tasks, their adaptation for the speech modality is lagging behind. We argue that a key factor hindering the diffusion of such methods in speech processing research lies in the complexity of defining interpretable acoustic features. In this paper, we discuss the key challenges in selecting the features for speech explanations. Also in light of existing research, we highlight current gaps and propose future avenues to enhance the depth and informativeness of explanations for speech.

2023

pdf bib

How To Build Competitive Multi-gender Speech Translation Models For Controlling Speaker Gender Translation
Marco Gaido | Dennis Fucci | Matteo Negri | Luisa Bentivogli
Proceedings of the Ninth Italian Conference on Computational Linguistics (CLiC-it 2023)

pdf bib abs

Integrating Language Models into Direct Speech Translation: An Inference-Time Solution to Control Gender Inflection
Dennis Fucci | Marco Gaido | Sara Papi | Mauro Cettolo | Matteo Negri | Luisa Bentivogli
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

When translating words referring to the speaker, speech translation (ST) systems should not resort to default masculine generics nor rely on potentially misleading vocal traits. Rather, they should assign gender according to the speakers’ preference. The existing solutions to do so, though effective, are hardly feasible in practice as they involve dedicated model re-training on gender-labeled ST data. To overcome these limitations, we propose the first inference-time solution to control speaker-related gender inflections in ST. Our approach partially replaces the (biased) internal language model (LM) implicitly learned by the ST decoder with gender-specific external LMs. Experiments on en→es/fr/it show that our solution outperforms the base models and the best training-time mitigation strategy by up to 31.0 and 1.6 points in gender accuracy, respectively, for feminine forms. The gains are even larger (up to 32.0 and 3.4) in the challenging condition where speakers’ vocal traits conflict with their gender.

pdf bib abs

Hi Guys or Hi Folks? Benchmarking Gender-Neutral Machine Translation with the GeNTE Corpus
Andrea Piergentili | Beatrice Savoldi | Dennis Fucci | Matteo Negri | Luisa Bentivogli
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Gender inequality is embedded in our communication practices and perpetuated in translation technologies. This becomes particularly apparent when translating into grammatical gender languages, where machine translation (MT) often defaults to masculine and stereotypical representations by making undue binary gender assumptions. Our work addresses the rising demand for inclusive language by focusing head-on on gender-neutral translation from English to Italian. We start from the essentials: proposing a dedicated benchmark and exploring automated evaluation methods. First, we introduce GeNTE, a natural, bilingual test set for gender-neutral translation, whose creation was informed by a survey on the perception and use of neutral language. Based on GeNTE, we then overview existing reference-based evaluation approaches, highlight their limits, and propose a reference-free method more suitable to assess gender-neutral translation.

pdf bib abs

Gender Neutralization for an Inclusive Machine Translation: from Theoretical Foundations to Open Challenges
Andrea Piergentili | Dennis Fucci | Beatrice Savoldi | Luisa Bentivogli | Matteo Negri
Proceedings of the First Workshop on Gender-Inclusive Translation Technologies

Gender inclusivity in language technologies has become a prominent research topic. In this study, we explore gender-neutral translation (GNT) as a form of gender inclusivity and a goal to be achieved by machine translation (MT) models, which have been found to perpetuate gender bias and discrimination. Specifically, we focus on translation from English into Italian, a language pair representative of salient gender-related linguistic transfer problems. To define GNT, we review a selection of relevant institutional guidelines for gender-inclusive language, discuss its scenarios of use, and examine the technical challenges of performing GNT in MT, concluding with a discussion of potential solutions to encourage advancements toward greater inclusivity in MT.

2022

pdf bib abs

The primary goal of this FBK’s systems submission to the IWSLT 2022 offline and simultaneous speech translation tasks is to reduce model training costs without sacrificing translation quality. As such, we first question the need of ASR pre-training, showing that it is not essential to achieve competitive results. Second, we focus on data filtering, showing that a simple method that looks at the ratio between source and target characters yields a quality improvement of 1 BLEU. Third, we compare different methods to reduce the detrimental effect of the audio segmentation mismatch between training data manually segmented at sentence level and inference data that is automatically segmented. Towards the same goal of training cost reduction, we participate in the simultaneous task with the same model trained for offline ST. The effectiveness of our lightweight training strategy is shown by the high score obtained on the MuST-C en-de corpus (26.7 BLEU) and is confirmed in high-resource data conditions by a 1.6 BLEU improvement on the IWSLT2020 test set over last year’s winning system.

Co-authors

André F. T. Martins 1

Marco Turchi 1

Guillaume Wisniewski 1

Venues

GITT1

IWSLT1

WS1

Fix author