Eva Vanmassenhove

2025

Proceedings of the 3rd Workshop on Gender-Inclusive Translation Technologies (GITT 2025)
Janiça Hackenbuchner | Luisa Bentivogli | Joke Daems | Chiara Manna | Beatrice Savoldi | Eva Vanmassenhove
Proceedings of the 3rd Workshop on Gender-Inclusive Translation Technologies (GITT 2025)

pdf bib abs

Are We Paying Attention to Her? Investigating Gender Disambiguation and Attention in Machine Translation
Chiara Manna | Afra Alishahi | Frédéric Blain | Eva Vanmassenhove
Proceedings of the 3rd Workshop on Gender-Inclusive Translation Technologies (GITT 2025)

While gender bias in modern Neural Machine Translation (NMT) systems has received much attention, the traditional evaluation metrics for these systems do not fully capture the extent to which models integrate contextual gender cues. We propose a novel evaluation metric called Minimal Pair Accuracy (MPA) which measures the reliance of models on gender cues for gender disambiguation. We evaluate a number of NMT models using this metric, we show that they ignore available gender cues in most cases in favour of (statistical) stereotypical gender interpretation. We further show that in anti-stereotypical cases, these models tend to more consistently take male gender cues into account while ignoring the female cues. Finally, we analyze the attention head weights in the encoder component of these models and show that while all models to some extent encode gender information, the male gender cues elicit a more diffused response compared to the more concentrated and specialized responses to female gender cues.

2024

pdf bib

Proceedings of the 2nd International Workshop on Gender-Inclusive Translation Technologies
Beatrice Savoldi | Janiça Hackenbuchner | Luisa Bentivogli | Joke Daems | Eva Vanmassenhove | Jasmijn Bastings
Proceedings of the 2nd International Workshop on Gender-Inclusive Translation Technologies

2023

pdf bib abs

Tailoring Domain Adaptation for Machine Translation Quality Estimation
Javad Pourmostafa Roshan Sharami | Dimitar Shterionov | Frédéric Blain | Eva Vanmassenhove | Mirella De Sisto | Chris Emmery | Pieter Spronck
Proceedings of the 24th Annual Conference of the European Association for Machine Translation

While quality estimation (QE) can play an important role in the translation process, its effectiveness relies on the availability and quality of training data. For QE in particular, high-quality labeled data is often lacking due to the high-cost and effort associated with labeling such data. Aside from the data scarcity challenge, QE models should also be generalizabile, i.e., they should be able to handle data from different domains, both generic and specific. To alleviate these two main issues — data scarcity and domain mismatch — this paper combines domain adaptation and data augmentation within a robust QE system. Our method is to first train a generic QE model and then fine-tune it on a specific domain while retaining generic knowledge. Our results show a significant improvement for all the language pairs investigated, better cross-lingual inference, and a superior performance in zero-shot learning scenarios as compared to state-of-the-art baselines.

pdf bib

Proceedings of the First Workshop on Gender-Inclusive Translation Technologies
Eva Vanmassenhove | Beatrice Savoldi | Luisa Bentivogli | Joke Daems | Janiça Hackenbuchner
Proceedings of the First Workshop on Gender-Inclusive Translation Technologies

2022

pdf bib abs

“Vaderland”, “Volk” and “Natie”: Semantic Change Related to Nationalism in Dutch Literature Between 1700 and 1880 Captured with Dynamic Bernoulli Word Embeddings
Marije Timmermans | Eva Vanmassenhove | Dimitar Shterionov
Proceedings of the 3rd Workshop on Computational Approaches to Historical Language Change

Languages can respond to external events in various ways - the creation of new words or named entities, additional senses might develop for already existing words or the valence of words can change. In this work, we explore the semantic shift of the Dutch words “natie” (“nation”), “volk” (“people”) and “vaderland” (“fatherland”) over a period that is known for the rise of nationalism in Europe: 1700-1880. The semantic change is measured by means of Dynamic Bernoulli Word Embeddings which allow for comparison between word embeddings over different time slices. The word embeddings were generated based on Dutch fiction literature divided over different decades. From the analysis of the absolute drifts, it appears that the word “natie” underwent a relatively small drift. However, the drifts of “vaderland’” and “volk”’ show multiple peaks, culminating around the turn of the nineteenth century. To verify whether this semantic change can indeed be attributed to nationalistic movements, a detailed analysis of the nearest neighbours of the target words is provided. From the analysis, it appears that “natie”, “volk” and “vaderlan”’ became more nationalistically-loaded over time.

2021

pdf bib abs

Generating Gender Augmented Data for NLP
Nishtha Jain | Maja Popović | Declan Groves | Eva Vanmassenhove
Proceedings of the 3rd Workshop on Gender Bias in Natural Language Processing

Gender bias is a frequent occurrence in NLP-based applications, especially pronounced in gender-inflected languages. Bias can appear through associations of certain adjectives and animate nouns with the natural gender of referents, but also due to unbalanced grammatical gender frequencies of inflected words. This type of bias becomes more evident in generating conversational utterances where gender is not specified within the sentence, because most current NLP applications still work on a sentence-level context. As a step towards more inclusive NLP, this paper proposes an automatic and generalisable re-writing approach for short conversational sentences. The rewriting method can be applied to sentences that, without extra-sentential context, have multiple equivalent alternatives in terms of gender. The method can be applied both for creating gender balanced outputs as well as for creating gender balanced training data. The proposed approach is based on a neural machine translation system trained to ‘translate’ from one gender alternative to another. Both the automatic and manual analysis of the approach show promising results with respect to the automatic generation of gender alternatives for conversational sentences in Spanish.

pdf bib abs

NeuTral Rewriter: A Rule-Based and Neural Approach to Automatic Rewriting into Gender Neutral Alternatives
Eva Vanmassenhove | Chris Emmery | Dimitar Shterionov
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Recent years have seen an increasing need for gender-neutral and inclusive language. Within the field of NLP, there are various mono- and bilingual use cases where gender inclusive language is appropriate, if not preferred due to ambiguity or uncertainty in terms of the gender of referents. In this work, we present a rule-based and a neural approach to gender-neutral rewriting for English along with manually curated synthetic data (WinoBias+) and natural data (OpenSubtitles and Reddit) benchmarks. A detailed manual and automatic evaluation highlights how our NeuTral Rewriter, trained on data generated by the rule-based approach, obtains word error rates (WER) below 0.18% on synthetic, in-domain and out-domain test sets.

pdf bib abs

Machine Translationese: Effects of Algorithmic Bias on Linguistic Complexity in Machine Translation
Eva Vanmassenhove | Dimitar Shterionov | Matthew Gwilliam
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Recent studies in the field of Machine Translation (MT) and Natural Language Processing (NLP) have shown that existing models amplify biases observed in the training data. The amplification of biases in language technology has mainly been examined with respect to specific phenomena, such as gender bias. In this work, we go beyond the study of gender in MT and investigate how bias amplification might affect language in a broader sense. We hypothesize that the ‘algorithmic bias’, i.e. an exacerbation of frequently observed patterns in combination with a loss of less frequent ones, not only exacerbates societal biases present in current datasets but could also lead to an artificially impoverished language: ‘machine translationese’. We assess the linguistic richness (on a lexical and morphological level) of translations created by different data-driven MT paradigms – phrase-based statistical (PB-SMT) and neural MT (NMT). Our experiments show that there is a loss of lexical and syntactic richness in the translations produced by all investigated MT paradigms for two language pairs (EN-FR and EN-ES).

pdf bib abs

A Comparison of Different NMT Approaches to Low-Resource Dutch-Albanian Machine Translation
Arbnor Rama | Eva Vanmassenhove
Proceedings of the 4th Workshop on Technologies for MT of Low Resource Languages (LoResMT2021)

Low-resource languages can be understood as languages that are more scarce, less studied, less privileged, less commonly taught and for which there are less resources available (Singh, 2008; Cieri et al., 2016; Magueresse et al., 2020). Natural Language Processing (NLP) research and technology mainly focuses on those languages for which there are large data sets available. To illustrate differences in data availability: there are 6 million Wikipedia articles available for English, 2 million for Dutch, and merely 82 thousand for Albanian. The scarce data issue becomes increasingly apparent when large parallel data sets are required for applications such as Neural Machine Translation (NMT). In this work, we investigate to what extent translation between Albanian (SQ) and Dutch (NL) is possible comparing a one-to-one (SQ↔AL) model, a low-resource pivot-based approach (English (EN) as pivot) and a zero-shot translation (ZST) (Johnson et al., 2016; Mattoni et al., 2017) system. From our experiments, it results that the EN-pivot-model outperforms both the direct one-to-one and the ZST model. Since often, small amounts of parallel data are available for low-resource languages or settings, experiments were conducted using small sets of parallel NL↔SQ data. The ZST appeared to be the worst performing models. Even when the available parallel data (NL↔SQ) was added, i.e. in a few-shot setting (FST), it remained the worst performing system according to the automatic (BLEU and TER) and human evaluation.

pdf bib abs

gENder-IT: An Annotated English-Italian Parallel Challenge Set for Cross-Linguistic Natural Gender Phenomena
Eva Vanmassenhove | Johanna Monti
Proceedings of the 3rd Workshop on Gender Bias in Natural Language Processing

Languages differ in terms of the absence or presence of gender features, the number of gender classes and whether and where gender features are explicitly marked. These cross-linguistic differences can lead to ambiguities that are difficult to resolve, especially for sentence-level MT systems. The identification of ambiguity and its subsequent resolution is a challenging task for which currently there aren’t any specific resources or challenge sets available. In this paper, we introduce gENder-IT, an English–Italian challenge set focusing on the resolution of natural gender phenomena by providing word-level gender tags on the English source side and multiple gender alternative translations, where needed, on the Italian target side.

2020

bib

A Case Study of Natural Gender Phenomena in Translation: A Comparison of Google Translate, Bing Microsoft Translator and DeepL for English to Italian, French and Spanish
Argentina Anna Rescigno | Johanna Monti | Andy Way | Eva Vanmassenhove
Workshop on the Impact of Machine Translation (iMpacT 2020)

pdf bib

A Case Study of Natural Gender Phenomena in Translation. A Comparison of Google Translate, Bing Microsoft Translator and DeepL for English to Italian, French and Spanish
Argentina Anna Rescigno | Eva Vanmassenhove | Johanna Monti | Andy Way
Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020)

2019

pdf bib

Lost in Translation: Loss and Decay of Linguistic Richness in Machine Translation
Eva Vanmassenhove | Dimitar Shterionov | Andy Way
Proceedings of Machine Translation Summit XVII: Research Track

2018

pdf bib abs

Europarl Datasets with Demographic Speaker Information
Eva Vanmassenhove | Christian Hardmeier
Proceedings of the 21st Annual Conference of the European Association for Machine Translation

Research on speaker-adapted neural machine translation (NMT) is scarce. One of the main challenges for more personalized MT systems is finding large enough annotated parallel datasets with speaker information. Rabinovich et al. (2017) published an annotated parallel dataset for EN–FR and EN–DE, however, for many other language pairs no sufficiently large annotated datasets are available.

pdf bib abs

Getting Gender Right in Neural Machine Translation
Eva Vanmassenhove | Christian Hardmeier | Andy Way
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Speakers of different languages must attend to and encode strikingly different aspects of the world in order to use their language correctly (Sapir, 1921; Slobin, 1996). One such difference is related to the way gender is expressed in a language. Saying “I am happy” in English, does not encode any additional knowledge of the speaker that uttered the sentence. However, many other languages do have grammatical gender systems and so such knowledge would be encoded. In order to correctly translate such a sentence into, say, French, the inherent gender information needs to be retained/recovered. The same sentence would become either “Je suis heureux”, for a male speaker or “Je suis heureuse” for a female one. Apart from morphological agreement, demographic factors (gender, age, etc.) also influence our use of language in terms of word choices or syntactic constructions (Tannen, 1991; Pennebaker et al., 2003). We integrate gender information into NMT systems. Our contribution is two-fold: (1) the compilation of large datasets with speaker information for 20 language pairs, and (2) a simple set of experiments that incorporate gender information into NMT for multiple language pairs. Our experiments show that adding a gender feature to an NMT system significantly improves the translation quality for some language pairs.

pdf bib abs

SuperNMT: Neural Machine Translation with Semantic Supersenses and Syntactic Supertags
Eva Vanmassenhove | Andy Way
Proceedings of ACL 2018, Student Research Workshop

In this paper we incorporate semantic supersensetags and syntactic supertag features into EN–FR and EN–DE factored NMT systems. In experiments on various test sets, we observe that such features (and particularly when combined) help the NMT model training to converge faster and improve the model quality according to the BLEU scores.