UNQOVERing Stereotyping Biases via Underspecified Questions

Tao Li, Daniel Khashabi, Tushar Khot, Ashish Sabharwal, Vivek Srikumar


Abstract
While language embeddings have been shown to have stereotyping biases, how these biases affect downstream question answering (QA) models remains unexplored. We present UNQOVER, a general framework to probe and quantify biases through underspecified questions. We show that a naive use of model scores can lead to incorrect bias estimates due to two forms of reasoning errors: positional dependence and question independence. We design a formalism that isolates the aforementioned errors. As case studies, we use this metric to analyze four important classes of stereotypes: gender, nationality, ethnicity, and religion. We probe five transformer-based QA models trained on two QA datasets, along with their underlying language models. Our broad study reveals that (1) all these models, with and without fine-tuning, have notable stereotyping biases in these classes; (2) larger models often have higher bias; and (3) the effect of fine-tuning on bias varies strongly with the dataset and the model size.
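The abstract's two score corrections can be sketched in code. This is a simplified illustration, not the paper's exact formalism: the `corrected_score` helper, the subject names, the attribute strings, and the probability values are all hypothetical; in practice the scores would be a QA model's answer probabilities over templated underspecified questions.

```python
# Illustrative sketch of the two corrections named in the abstract.
# All names and numbers below are hypothetical toy values.

def corrected_score(scores, x1, x2, attr, neg_attr):
    """Estimate how much a model favors subject x1 for an attribute after
    removing two reasoning errors:
      - positional dependence: average over the two subject orders;
      - question independence: subtract the score x1 gets under the
        negated question, which an unbiased model would match."""
    # Score for answering x1 to the attribute question, averaged over order.
    s_attr = 0.5 * (scores[(x1, x2, attr, x1)] + scores[(x2, x1, attr, x1)])
    # The same quantity under the negated question.
    s_neg = 0.5 * (scores[(x1, x2, neg_attr, x1)] + scores[(x2, x1, neg_attr, x1)])
    return 0.5 * (s_attr - s_neg)

# Hypothetical answer probabilities, keyed by
# (first subject, second subject, question attribute, predicted answer).
toy_scores = {
    ("John", "Mary", "good at math", "John"): 0.6,
    ("Mary", "John", "good at math", "John"): 0.5,
    ("John", "Mary", "bad at math", "John"): 0.3,
    ("Mary", "John", "bad at math", "John"): 0.4,
}

bias = corrected_score(toy_scores, "John", "Mary", "good at math", "bad at math")
print(round(bias, 3))  # positive: the toy scores lean toward "John" for this attribute
```

A score near zero would indicate no stereotyping association between the subject and the attribute once both reasoning errors are cancelled out.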
Anthology ID:
2020.findings-emnlp.311
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2020
Month:
November
Year:
2020
Address:
Online
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3475–3489
URL:
https://aclanthology.org/2020.findings-emnlp.311
DOI:
10.18653/v1/2020.findings-emnlp.311
Cite (ACL):
Tao Li, Daniel Khashabi, Tushar Khot, Ashish Sabharwal, and Vivek Srikumar. 2020. UNQOVERing Stereotyping Biases via Underspecified Questions. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 3475–3489, Online. Association for Computational Linguistics.
Cite (Informal):
UNQOVERing Stereotyping Biases via Underspecified Questions (Li et al., Findings 2020)
PDF:
https://aclanthology.org/2020.findings-emnlp.311.pdf
Code
 allenai/unqover
Data
 NewsQA
 SQuAD
 StereoSet