GenderAlign: An Alignment Dataset for Mitigating Gender Bias in Large Language Models

Tao Zhang; Ziqian Zeng; Yuxiang Xiao; Huiping Zhuang; Cen Chen; James R. Foulds; Shimei Pan

doi:10.18653/v1/2025.acl-long.553

Abstract

Large Language Models (LLMs) are prone to generating content that exhibits gender biases, raising significant ethical concerns. Alignment, the process of fine-tuning LLMs to better align with desired behaviors, is recognized as an effective approach to mitigate gender biases. Although proprietary LLMs have made significant strides in mitigating gender bias, their alignment datasets are not publicly available. The commonly used and publicly available alignment dataset, HH-RLHF, still exhibits gender bias to some extent. There is a lack of publicly available alignment datasets specifically designed to address gender bias. Hence, we developed a new dataset named GenderAlign, which aims to mitigate a comprehensive set of gender biases in LLMs. This dataset comprises 8k single-turn dialogues, each paired with a “chosen” and a “rejected” response. Compared to the “rejected” responses, the “chosen” responses demonstrate lower levels of gender bias and higher quality. Furthermore, we categorized the gender biases in the “rejected” responses of GenderAlign into four principal categories. The experimental results show the effectiveness of GenderAlign in reducing gender bias in LLMs.
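The record structure described in the abstract (a single-turn dialogue paired with a "chosen" and a "rejected" response) can be pictured with a minimal Python sketch. The class name `PreferencePair`, the field names, and the helper `is_well_formed` are assumptions made for illustration; they are not taken from the released dataset, whose actual schema may differ. The example strings are placeholders, not dataset content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferencePair:
    """One single-turn dialogue with two candidate responses (hypothetical schema)."""
    prompt: str    # the user turn
    chosen: str    # the preferred response (less biased, higher quality)
    rejected: str  # the dispreferred response


def is_well_formed(pair: PreferencePair) -> bool:
    """Basic sanity check: all fields non-empty and the two responses differ."""
    fields = (pair.prompt, pair.chosen, pair.rejected)
    return all(f.strip() for f in fields) and pair.chosen != pair.rejected


# Placeholder record; the text below is invented and not from GenderAlign.
example = PreferencePair(
    prompt="<user question>",
    chosen="<preferred response>",
    rejected="<dispreferred response>",
)
```

A preference-based alignment method would typically consume such records as (prompt, chosen, rejected) triples; which method the authors use is not stated in the abstract.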

Anthology ID: 2025.acl-long.553
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 11293–11311
URL: https://aclanthology.org/2025.acl-long.553/
DOI: 10.18653/v1/2025.acl-long.553
Cite (ACL): Tao Zhang, Ziqian Zeng, Yuxiang Xiao, Huiping Zhuang, Cen Chen, James R. Foulds, and Shimei Pan. 2025. GenderAlign: An Alignment Dataset for Mitigating Gender Bias in Large Language Models. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 11293–11311, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): GenderAlign: An Alignment Dataset for Mitigating Gender Bias in Large Language Models (Zhang et al., ACL 2025)
PDF: https://aclanthology.org/2025.acl-long.553.pdf
