The (Undesired) Attenuation of Human Biases by Multilinguality

Cristina España-Bonet; Alberto Barrón-Cedeño

doi:10.18653/v1/2022.emnlp-main.133

Abstract

Some human preferences are universal. The odor of vanilla is perceived as pleasant all around the world. We expect neural models trained on human texts to exhibit these kinds of preferences, i.e., biases, but we show that this is not always the case. We explore 16 static and contextual embedding models in 9 languages and, when possible, compare them under similar training conditions. We introduce and release CA-WEAT, multilingual, culturally aware tests to quantify biases, and compare them to previous English-centric tests. Our experiments confirm that monolingual static embeddings do exhibit human biases, but values differ across languages, being far from universal. Biases are less evident in contextual models, to the point that the original human association might be reversed. Multilinguality proves to be another variable that attenuates and even reverses the effect of the bias, especially in contextual multilingual models. In order to explain this variance among models and languages, we examine the effect of asymmetries in the training corpus, departures from isomorphism in multilingual embedding spaces and discrepancies in the testing measures between languages.

Anthology ID: 2022.emnlp-main.133
Volume: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 2056–2077
URL: https://aclanthology.org/2022.emnlp-main.133/
DOI: 10.18653/v1/2022.emnlp-main.133
Cite (ACL): Cristina España-Bonet and Alberto Barrón-Cedeño. 2022. The (Undesired) Attenuation of Human Biases by Multilinguality. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 2056–2077, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): The (Undesired) Attenuation of Human Biases by Multilinguality (España-Bonet & Barrón-Cedeño, EMNLP 2022)
PDF: https://aclanthology.org/2022.emnlp-main.133.pdf
Video: https://aclanthology.org/2022.emnlp-main.133.mp4
