Automatic detection of (potential) factors in the source text leading to gender bias in machine translation

Janiça Hackenbuchner, Arda Tezcan, Joke Daems


Abstract
This research project aims to develop a comprehensive methodology to help make machine translation (MT) systems more gender-inclusive for society. The goal is the creation of a detection system, a machine learning (ML) model trained on manual annotations, that can automatically analyse source data and detect and highlight words and phrases that influence the gender bias inflection in target translations. The main research outputs will be (1) a manually annotated dataset, (2) a taxonomy, and (3) a fine-tuned model.
Anthology ID:
2024.eamt-2.14
Volume:
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 2)
Month:
June
Year:
2024
Address:
Sheffield, UK
Editors:
Carolina Scarton, Charlotte Prescott, Chris Bayliss, Chris Oakley, Joanna Wright, Stuart Wrigley, Xingyi Song, Edward Gow-Smith, Mikel Forcada, Helena Moniz
Venue:
EAMT
Publisher:
European Association for Machine Translation (EAMT)
Pages:
27–28
URL:
https://aclanthology.org/2024.eamt-2.14
Cite (ACL):
Janiça Hackenbuchner, Arda Tezcan, and Joke Daems. 2024. Automatic detection of (potential) factors in the source text leading to gender bias in machine translation. In Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 2), pages 27–28, Sheffield, UK. European Association for Machine Translation (EAMT).
Cite (Informal):
Automatic detection of (potential) factors in the source text leading to gender bias in machine translation (Hackenbuchner et al., EAMT 2024)
PDF:
https://aclanthology.org/2024.eamt-2.14.pdf