Analyzing Gender Translation Errors to Identify Information Flows between the Encoder and Decoder of a NMT System

Guillaume Wisniewski, Lichao Zhu, Nicolas Ballier, François Yvon


Abstract
Multiple studies have shown that existing NMT systems demonstrate some kind of “gender bias”. As a result, MT output appears to err more often for feminine forms and to amplify social gender misrepresentations, which is potentially harmful to users and practitioners of these technologies. This paper continues this line of investigation and reports results obtained with a new test set in strictly controlled conditions. This setting allows us to better understand the multiple inner mechanisms that are causing these biases, which include the linguistic expressions of gender, the unbalanced distribution of masculine and feminine forms in the language, the modelling of morphological variation and the training process dynamics. To counterbalance these effects, we formulate several proposals and notably show that modifying the training loss can effectively mitigate such biases.
Anthology ID:
2022.blackboxnlp-1.13
Volume:
Proceedings of the Fifth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates (Hybrid)
Editors:
Jasmijn Bastings, Yonatan Belinkov, Yanai Elazar, Dieuwke Hupkes, Naomi Saphra, Sarah Wiegreffe
Venue:
BlackboxNLP
Publisher:
Association for Computational Linguistics
Pages:
153–163
URL:
https://aclanthology.org/2022.blackboxnlp-1.13
DOI:
10.18653/v1/2022.blackboxnlp-1.13
Cite (ACL):
Guillaume Wisniewski, Lichao Zhu, Nicolas Ballier, and François Yvon. 2022. Analyzing Gender Translation Errors to Identify Information Flows between the Encoder and Decoder of a NMT System. In Proceedings of the Fifth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, pages 153–163, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal):
Analyzing Gender Translation Errors to Identify Information Flows between the Encoder and Decoder of a NMT System (Wisniewski et al., BlackboxNLP 2022)
PDF:
https://aclanthology.org/2022.blackboxnlp-1.13.pdf