Addressing Both Statistical and Causal Gender Fairness in NLP Models

Hannah Chen, Yangfeng Ji, David Evans


Abstract
Statistical fairness stipulates equivalent outcomes for every protected group, whereas causal fairness prescribes that a model makes the same prediction for an individual regardless of their protected characteristics. Counterfactual data augmentation (CDA) is effective for reducing bias in NLP models, yet models trained with CDA are often evaluated only on metrics that are closely tied to the causal fairness notion; similarly, sampling-based methods designed to promote statistical fairness are rarely evaluated for causal fairness. In this work, we evaluate both statistical and causal debiasing methods for gender bias in NLP models, and find that while such methods are effective at reducing bias as measured by the targeted metric, they do not necessarily improve results on other bias metrics. We demonstrate that combinations of statistical and causal debiasing techniques are able to reduce bias measured through both types of metrics.
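To make the two fairness notions in the abstract concrete, here is a minimal, hypothetical Python sketch (not the authors' implementation) of CDA-style augmentation, a statistical parity gap, and a counterfactual flip rate. The swap list, `predict_fn`, and dataset shapes are illustrative assumptions only.

```python
# Hypothetical sketch contrasting the two debiasing/evaluation styles from the abstract.
# GENDER_SWAPS, predict_fn, and the data layout are placeholders, not the paper's setup.

GENDER_SWAPS = {"he": "she", "she": "he", "him": "her",
                "his": "her", "her": "him", "man": "woman", "woman": "man"}

def counterfactual(text):
    """Gender-swapped counterfactual of a sentence (naive token-level swap)."""
    return " ".join(GENDER_SWAPS.get(tok, tok) for tok in text.lower().split())

def augment_with_cda(dataset):
    """CDA: append a gender-swapped copy of every (text, label) example."""
    return dataset + [(counterfactual(text), label) for text, label in dataset]

def statistical_parity_gap(predict_fn, examples_by_group):
    """Statistical (group) fairness: gap in positive-prediction rates across groups."""
    rates = [sum(predict_fn(x) for x in group) / len(group)
             for group in examples_by_group]
    return max(rates) - min(rates)

def counterfactual_flip_rate(predict_fn, examples):
    """Causal (counterfactual) fairness: fraction of predictions that change
    when gendered terms in the input are swapped."""
    flips = sum(predict_fn(x) != predict_fn(counterfactual(x)) for x in examples)
    return flips / len(examples)
```

A model debiased with CDA alone would typically be checked with `counterfactual_flip_rate`, while a sampling-based method would be checked with `statistical_parity_gap`; the paper's point is that improving one metric does not guarantee improvement on the other.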
Anthology ID:
2024.findings-naacl.38
Volume:
Findings of the Association for Computational Linguistics: NAACL 2024
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
561–582
URL:
https://aclanthology.org/2024.findings-naacl.38
Cite (ACL):
Hannah Chen, Yangfeng Ji, and David Evans. 2024. Addressing Both Statistical and Causal Gender Fairness in NLP Models. In Findings of the Association for Computational Linguistics: NAACL 2024, pages 561–582, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Addressing Both Statistical and Causal Gender Fairness in NLP Models (Chen et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-naacl.38.pdf
Copyright:
2024.findings-naacl.38.copyright.pdf