Statistically Profiling Biases in Natural Language Reasoning Datasets and Models

Shanshan Huang; Kenny Zhu

doi:10.18653/v1/2023.findings-emnlp.299

Abstract

Recent studies have shown that many natural language understanding and reasoning datasets contain statistical cues that can be exploited by NLP models, resulting in an overestimation of their capabilities. Existing methods, such as “hypothesis-only” tests and CheckList, are limited in identifying these cues and evaluating model weaknesses. We introduce ICQ (I-See-Cue), a lightweight, general statistical profiling framework that automatically identifies potential biases in multiple-choice NLU datasets without requiring additional test cases. ICQ assesses the extent to which models exploit these biases through black-box testing, addressing the limitations of current methods. In this work, we conduct a comprehensive evaluation of statistical biases in 10 popular NLU datasets and 4 models, confirming prior findings, revealing new insights, and offering an online demonstration system to encourage users to assess their own datasets and models. Furthermore, we present a case study on investigating ChatGPT’s bias, providing valuable recommendations for practical applications.
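The kind of statistical cue the abstract describes can be illustrated with a minimal sketch. This is not the ICQ implementation: the function name `cue_scores`, the toy examples, and the metric itself (the fraction of a token's occurrences that fall in the correct option) are assumptions chosen for illustration only.

```python
from collections import Counter


def cue_scores(examples, min_count=2):
    """Score each token by how often it appears in the correct option.

    examples: list of (options, label_index) pairs, where options is a
    list of answer strings and label_index marks the correct one.
    Returns {token: share of its option-level occurrences that are in
    the correct option}. Hypothetical metric, not the one used by ICQ.
    """
    in_correct = Counter()
    in_any = Counter()
    for options, label in examples:
        for i, option in enumerate(options):
            # Count each token at most once per option.
            for token in set(option.lower().split()):
                in_any[token] += 1
                if i == label:
                    in_correct[token] += 1
    # Drop rare tokens, whose scores would be unreliable.
    return {
        tok: in_correct[tok] / n
        for tok, n in in_any.items()
        if n >= min_count
    }


# Toy two-option dataset (invented) in which "not" always marks the answer.
toy = [
    (["he did not go", "he went home"], 0),
    (["she was happy", "she was not happy"], 1),
    (["it is not red", "it is blue"], 0),
]
scores = cue_scores(toy)
```

With two options per question, a token unrelated to the label should score near 0.5. In this toy data "not" scores 1.0, so a model could answer every question by spotting that word alone, without any reasoning.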

Anthology ID: 2023.findings-emnlp.299
Volume: Findings of the Association for Computational Linguistics: EMNLP 2023
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 4521–4530
URL: https://aclanthology.org/2023.findings-emnlp.299
DOI: 10.18653/v1/2023.findings-emnlp.299
Cite (ACL): Shanshan Huang and Kenny Zhu. 2023. Statistically Profiling Biases in Natural Language Reasoning Datasets and Models. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 4521–4530, Singapore. Association for Computational Linguistics.
Cite (Informal): Statistically Profiling Biases in Natural Language Reasoning Datasets and Models (Huang & Zhu, Findings 2023)
PDF: https://aclanthology.org/2023.findings-emnlp.299.pdf
