Leveraging Prototypical Representations for Mitigating Social Bias without Demographic Information

Shadi Iskander, Kira Radinsky, Yonatan Belinkov


Abstract
Mitigating social biases typically requires identifying the social groups associated with each data sample. In this paper, we present DAFair, a novel approach to address social bias in language models. Unlike traditional methods that rely on explicit demographic labels, our approach does not require any such information. Instead, we leverage predefined prototypical demographic texts and incorporate a regularization term during the fine-tuning process to mitigate bias in the model’s representations. Our empirical results across two tasks and two models demonstrate the effectiveness of our method compared to previous approaches that do not rely on labeled data. Moreover, with limited demographic-annotated data, our approach outperforms common debiasing approaches.
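The abstract does not state the form of the regularization term, so the following is only a minimal sketch of one way such a term could look, not the authors' implementation. It assumes that each prototypical demographic text has already been encoded into a vector, that similarity is a dot product, and that the penalty is the KL divergence between a uniform distribution and the softmax of those similarities. The names `prototype_regularizer`, `total_loss`, and `lam` are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def prototype_regularizer(representation, prototypes):
    """Assumed penalty: KL(uniform || softmax(similarities)).

    It is zero when the representation is equally similar to every
    prototype, and grows as it leans toward one group's prototype.
    """
    sims = [dot(representation, p) for p in prototypes]
    probs = softmax(sims)
    uniform = 1.0 / len(prototypes)
    return sum(uniform * math.log(uniform / p) for p in probs)


def total_loss(task_loss, representation, prototypes, lam=1.0):
    """Task loss plus the weighted penalty (lam is a hypothetical weight)."""
    return task_loss + lam * prototype_regularizer(representation, prototypes)


# Toy example with two hypothetical prototype vectors.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
balanced = prototype_regularizer([1.0, 1.0], prototypes)
skewed = prototype_regularizer([2.0, 0.0], prototypes)
```

In this toy setup the balanced representation incurs no penalty while the skewed one does, which is the behavior a regularizer of this kind would need. No demographic label for the input sample is used at any point.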
Anthology ID:
2024.naacl-short.33
Volume:
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
379–390
URL:
https://aclanthology.org/2024.naacl-short.33
DOI:
10.18653/v1/2024.naacl-short.33
Cite (ACL):
Shadi Iskander, Kira Radinsky, and Yonatan Belinkov. 2024. Leveraging Prototypical Representations for Mitigating Social Bias without Demographic Information. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers), pages 379–390, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Leveraging Prototypical Representations for Mitigating Social Bias without Demographic Information (Iskander et al., NAACL 2024)
PDF:
https://aclanthology.org/2024.naacl-short.33.pdf