Towards Identifying Social Bias in Dialog Systems: Framework, Dataset, and Benchmark

Jingyan Zhou, Jiawen Deng, Fei Mi, Yitong Li, Yasheng Wang, Minlie Huang, Xin Jiang, Qun Liu, Helen Meng


Abstract
Among all the safety concerns that hinder the deployment of open-domain dialog systems (e.g., offensive languages, biases, and toxic behaviors), social bias presents an insidious challenge. Addressing this challenge requires rigorous analyses and normative reasoning. In this paper, we focus our investigation on social bias measurement to facilitate the development of unbiased dialog systems. We first propose a novel Dial-Bias Framework for analyzing the social bias in conversations using a holistic method beyond bias lexicons or dichotomous annotations. Leveraging the proposed framework, we further introduce the CDial-Bias Dataset which is, to the best of our knowledge, the first annotated Chinese social bias dialog dataset. We also establish a fine-grained dialog bias measurement benchmark and conduct in-depth ablation studies to shed light on the utility of the detailed annotations in the proposed dataset. Finally, we evaluate representative Chinese generative models with our classifiers to unveil the presence of social bias in these systems.
Anthology ID:
2022.findings-emnlp.262
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2022
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
Findings
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
3576–3591
Language:
URL:
https://aclanthology.org/2022.findings-emnlp.262
DOI:
10.18653/v1/2022.findings-emnlp.262
Bibkey:
Cite (ACL):
Jingyan Zhou, Jiawen Deng, Fei Mi, Yitong Li, Yasheng Wang, Minlie Huang, Xin Jiang, Qun Liu, and Helen Meng. 2022. Towards Identifying Social Bias in Dialog Systems: Framework, Dataset, and Benchmark. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 3576–3591, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Towards Identifying Social Bias in Dialog Systems: Framework, Dataset, and Benchmark (Zhou et al., Findings 2022)
Copy Citation:
PDF:
https://aclanthology.org/2022.findings-emnlp.262.pdf