Test Suites Task: Evaluation of Gender Fairness in MT with MuST-SHE and INES

Beatrice Savoldi, Marco Gaido, Matteo Negri, Luisa Bentivogli


Abstract
As part of the WMT-2023 “Test suites” shared task, this paper summarizes the results of two test suite evaluations: MuST-SHEWMT23 and INES. Focusing on the en-de and de-en language pairs, we use these newly created test suites to investigate systems’ ability to translate feminine and masculine gender and to produce gender-inclusive translations. Furthermore, we discuss the metrics associated with our test suites and validate them by means of human evaluations. Our results indicate that systems achieve reasonable and comparable performance in correctly translating both feminine and masculine gender forms for naturalistic gender phenomena. By contrast, the generation of inclusive language forms in translation emerges as a challenging task for all the evaluated MT models, indicating room for future improvements and research on the topic. We make MuST-SHEWMT23 and INES freely available.
Anthology ID:
2023.wmt-1.25
Volume:
Proceedings of the Eighth Conference on Machine Translation
Month:
December
Year:
2023
Address:
Singapore
Editors:
Philipp Koehn, Barry Haddow, Tom Kocmi, Christof Monz
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
252–262
URL:
https://aclanthology.org/2023.wmt-1.25
DOI:
10.18653/v1/2023.wmt-1.25
Cite (ACL):
Beatrice Savoldi, Marco Gaido, Matteo Negri, and Luisa Bentivogli. 2023. Test Suites Task: Evaluation of Gender Fairness in MT with MuST-SHE and INES. In Proceedings of the Eighth Conference on Machine Translation, pages 252–262, Singapore. Association for Computational Linguistics.
Cite (Informal):
Test Suites Task: Evaluation of Gender Fairness in MT with MuST-SHE and INES (Savoldi et al., WMT 2023)
PDF:
https://aclanthology.org/2023.wmt-1.25.pdf