Proceedings of the 2nd International Workshop on Gender-Inclusive Translation Technologies

Beatrice Savoldi, Janiça Hackenbuchner, Luisa Bentivogli, Joke Daems, Eva Vanmassenhove, Jasmijn Bastings (Editors)

Anthology ID: 2024.gitt-1
Month: June
Year: 2024
Address: Sheffield, United Kingdom
Venues: GITT | WS
Publisher: European Association for Machine Translation (EAMT)
URL: https://aclanthology.org/2024.gitt-1
PDF: https://aclanthology.org/2024.gitt-1.pdf

Although Machine Translation (MT) research has progressed over the years, translation systems still suffer from biases, including gender bias. An active line of research studies the existence and mitigation of gender bias in machine translation systems, but little of it explores this phenomenon for low-resource languages. The limited availability of linguistic and computational resources, compounded by the lack of benchmark datasets, makes studying bias for low-resource languages that much more difficult. In this paper, we construct benchmark datasets to evaluate gender bias in machine translation for three low-resource languages: Afaan Oromoo (Orm), Amharic (Amh), and Tigrinya (Tir). Building on prior work, we collected 2400 gender-balanced sentences translated in parallel into the three languages. From human evaluations of the dataset we collected, we found that about 93% of Afaan Oromoo, 80% of Tigrinya, and 72% of Amharic sentences exhibited gender bias. In addition to providing benchmarks for improving gender bias mitigation research in the three languages, we hope the careful documentation of our work will help researchers working on other low-resource languages extend our approach to their languages.

Sparks of Fairness: Preliminary Evidence of Commercial Machine Translation as English-to-German Gender-Fair Dictionaries
Manuel Lardelli | Timm Dill | Giuseppe Attanasio | Anne Lauscher

Bilingual dictionaries are bedrock components for several language tasks, including translation. However, dictionaries are traditionally fixed in time, thus excluding those neologisms and neo-morphemes that challenge the language’s nominal morphology. The need for a more dynamic, mutable alternative makes machine translation (MT) systems a valuable avenue. This paper investigates whether commercial MT systems can be used as bilingual dictionaries for gender-neutral translation. We focus on the English-to-German pair, where notional gender in the source requires gender inflection in the target. We translated 115 person-referring terms using Google Translate, Microsoft Bing, and DeepL and discovered that while each system is heavily biased towards the masculine gender, DeepL often provides gender-fair alternatives to users, especially with plurals.

Gender and bias in Amazon review translations: by humans, MT systems and ChatGPT
Maja Popovic | Ekaterina Lapshinova-Koltunski

This paper presents an analysis of first-person gender in five different translation variants of Amazon product reviews: those produced by professional translators, by translation students, with different machine translation (MT) systems and with ChatGPT. The analysis revealed that the majority of the reviews were translated into the masculine first-person gender, both by humans and by machines. Further inspection revealed that the choice of the gender in a translation is not related to the actual gender of the translator. Finally, the analysis of different products showed that there are certain bias tendencies, because the distribution of genders notably differs for different products.

You Shall Know a Word’s Gender by the Company it Keeps: Comparing the Role of Context in Human Gender Assumptions with MT
Janiça Hackenbuchner | Joke Daems | Arda Tezcan | Aaron Maladry

In this paper, we analyse to what extent machine translation (MT) systems and humans base their gender translations and associations on role names and on stereotypicality in the absence of (generic) grammatical gender cues in language. We compare an MT system’s choice of gender for a certain word when translating from a notional gender language, English, into a grammatical gender language, German, with the gender associations of humans. We outline a comparative case study of gender translation and annotation of words in isolation, out-of-context, and words in sentence contexts. The analysis reveals patterns of gender (bias) by MT and gender associations by humans for certain (1) out-of-context words and (2) words in-context. Our findings reveal the impact of context on gender choice and translation and show that word-level analyses fall short in such studies.

Lost in Translation? Approaches to Gender Representation in Multilingual Archives
Mrinalini Luthra | Brecht Nijman

The GLOBALISE project’s digitalisation of the Dutch East India Company (VOC) archives raises questions about representing gender and marginalised identities. This paper outlines the challenges of accurately conveying gender information in the archives, highlighting issues such as the lack of self-identified gender descriptions, low representation of marginalised groups, colonial context, and multilingualism in the collection. Machine learning (ML) and machine translation (MT) used in the digitalisation process may amplify existing biases and under-representation. To address these issues, the paper proposes a gender policy for GLOBALISE, offering guidelines and methodologies for handling gender information and increasing the visibility of marginalised identities. The policy contributes to discussions about representing gender and diversity in digital historical research, ML, and MT.

Pilot testing gender-inclusive translations and machine translations for German quadball referee certification test takers
Joke Daems

Gender-inclusive translations are the default at the International Quadball Association, yet translators make different choices for the (timed) referee certification tests to improve readability. However, the actual impact of a strategy on readability and performance has not been tested. This pilot study explores the impact of translation strategy (masculine generic, gender-inclusive, and machine translation) on the speed, performance and perceptions of quadball referee test takers in German. It shows promise for inclusive over masculine strategies, and suggests limited usefulness of MT in this context.