@inproceedings{ivetta-etal-2025-insights,
title = "Insights from a Disaggregated Analysis of Kinds of Biases in a Multicultural Dataset",
author = "Ivetta, Guido and
Maina, Hern{\'a}n and
Benotti, Luciana",
editor = "Zhang, Chen and
Allaway, Emily and
Shen, Hua and
Miculicich, Lesly and
Li, Yinqiao and
M'hamdi, Meryem and
Limkonchotiwat, Peerat and
Bai, Richard He and
T.y.s.s., Santosh and
Han, Sophia Simeng and
Thapa, Surendrabikram and
Rim, Wiem Ben",
booktitle = "Proceedings of the 9th Widening NLP Workshop",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.winlp-main.20/",
pages = "116--122",
ISBN = "979-8-89176-351-7",
    abstract = "Warning: This paper contains explicit statements of offensive stereotypes which may be upsetting. Stereotypes vary across cultural contexts, making it essential to understand how language models encode social biases. MultiLingualCrowsPairs is a dataset of culturally adapted stereotypical and anti-stereotypical sentence pairs across nine languages. While prior work has primarily reported average fairness metrics on masked language models, this paper analyzes social biases in generative models by disaggregating results across specific bias types. We find that although most languages show an overall preference for stereotypical sentences, this masks substantial variation across different types of bias, such as gender, religion, and socioeconomic status. Our findings underscore that relying solely on aggregated metrics can obscure important patterns, and that fine-grained, bias-specific analysis is critical for meaningful fairness evaluation."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="ivetta-etal-2025-insights">
<titleInfo>
<title>Insights from a Disaggregated Analysis of Kinds of Biases in a Multicultural Dataset</title>
</titleInfo>
<name type="personal">
<namePart type="given">Guido</namePart>
<namePart type="family">Ivetta</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Hernán</namePart>
<namePart type="family">Maina</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Luciana</namePart>
<namePart type="family">Benotti</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-11</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 9th Widening NLP Workshop</title>
</titleInfo>
<name type="personal">
<namePart type="given">Chen</namePart>
<namePart type="family">Zhang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Emily</namePart>
<namePart type="family">Allaway</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Hua</namePart>
<namePart type="family">Shen</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Lesly</namePart>
<namePart type="family">Miculicich</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Yinqiao</namePart>
<namePart type="family">Li</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Meryem</namePart>
<namePart type="family">M’hamdi</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Peerat</namePart>
<namePart type="family">Limkonchotiwat</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Richard</namePart>
<namePart type="given">He</namePart>
<namePart type="family">Bai</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Santosh</namePart>
<namePart type="family">T.y.s.s.</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sophia</namePart>
<namePart type="given">Simeng</namePart>
<namePart type="family">Han</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Surendrabikram</namePart>
<namePart type="family">Thapa</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Wiem</namePart>
<namePart type="given">Ben</namePart>
<namePart type="family">Rim</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Suzhou, China</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-351-7</identifier>
</relatedItem>
<abstract>Warning: This paper contains explicit statements of offensive stereotypes which may be upsetting. Stereotypes vary across cultural contexts, making it essential to understand how language models encode social biases. MultiLingualCrowsPairs is a dataset of culturally adapted stereotypical and anti-stereotypical sentence pairs across nine languages. While prior work has primarily reported average fairness metrics on masked language models, this paper analyzes social biases in generative models by disaggregating results across specific bias types. We find that although most languages show an overall preference for stereotypical sentences, this masks substantial variation across different types of bias, such as gender, religion, and socioeconomic status. Our findings underscore that relying solely on aggregated metrics can obscure important patterns, and that fine-grained, bias-specific analysis is critical for meaningful fairness evaluation.</abstract>
<identifier type="citekey">ivetta-etal-2025-insights</identifier>
<location>
<url>https://aclanthology.org/2025.winlp-main.20/</url>
</location>
<part>
<date>2025-11</date>
<extent unit="page">
<start>116</start>
<end>122</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Insights from a Disaggregated Analysis of Kinds of Biases in a Multicultural Dataset
%A Ivetta, Guido
%A Maina, Hernán
%A Benotti, Luciana
%Y Zhang, Chen
%Y Allaway, Emily
%Y Shen, Hua
%Y Miculicich, Lesly
%Y Li, Yinqiao
%Y M’hamdi, Meryem
%Y Limkonchotiwat, Peerat
%Y Bai, Richard He
%Y T.y.s.s., Santosh
%Y Han, Sophia Simeng
%Y Thapa, Surendrabikram
%Y Rim, Wiem Ben
%S Proceedings of the 9th Widening NLP Workshop
%D 2025
%8 November
%I Association for Computational Linguistics
%C Suzhou, China
%@ 979-8-89176-351-7
%F ivetta-etal-2025-insights
%X Warning: This paper contains explicit statements of offensive stereotypes which may be upsetting. Stereotypes vary across cultural contexts, making it essential to understand how language models encode social biases. MultiLingualCrowsPairs is a dataset of culturally adapted stereotypical and anti-stereotypical sentence pairs across nine languages. While prior work has primarily reported average fairness metrics on masked language models, this paper analyzes social biases in generative models by disaggregating results across specific bias types. We find that although most languages show an overall preference for stereotypical sentences, this masks substantial variation across different types of bias, such as gender, religion, and socioeconomic status. Our findings underscore that relying solely on aggregated metrics can obscure important patterns, and that fine-grained, bias-specific analysis is critical for meaningful fairness evaluation.
%U https://aclanthology.org/2025.winlp-main.20/
%P 116-122
Markdown (Informal)
[Insights from a Disaggregated Analysis of Kinds of Biases in a Multicultural Dataset](https://aclanthology.org/2025.winlp-main.20/) (Ivetta et al., WiNLP 2025)
ACL
Guido Ivetta, Hernán Maina, and Luciana Benotti. 2025. [Insights from a Disaggregated Analysis of Kinds of Biases in a Multicultural Dataset](https://aclanthology.org/2025.winlp-main.20/). In *Proceedings of the 9th Widening NLP Workshop*, pages 116–122, Suzhou, China. Association for Computational Linguistics.