Uncovering Stereotypes in Large Language Models: A Task Complexity-based Approach

Hari Shrawgi; Prasanjit Rath; Tushar Singhal; Sandipan Dandapat

Abstract

Recent Large Language Models (LLMs) have unlocked unprecedented applications of AI. As these models continue to transform human life, there are growing socio-ethical concerns that their inherent stereotypes can lead to bias in their applications. There is an urgent need for holistic bias evaluation of these LLMs. Few such benchmarks exist today, and the evaluation techniques that do exist are either non-holistic or may provide a false sense of security as LLMs become better at hiding their biases on simpler tasks. We address these issues with an extensible benchmark, the LLM Stereotype Index (LSI). LSI is grounded in the Social Progress Index, a holistic social benchmark. We also test the breadth and depth of bias protection provided by LLMs via a variety of tasks of varying complexity. Our findings show that both ChatGPT and GPT-4 have strong inherent prejudice with respect to nationality, gender, race, and religion. These issues become increasingly apparent as we increase task complexity. Furthermore, GPT-4 is better at hiding its biases, but when they are displayed they are more significant. Our findings highlight the harm and division that these LLMs can bring to society if they are not used with great care.

Anthology ID: 2024.eacl-long.111
Volume: Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: March
Year: 2024
Address: St. Julian’s, Malta
Editors: Yvette Graham, Matthew Purver
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 1841–1857
URL: https://aclanthology.org/2024.eacl-long.111
Cite (ACL): Hari Shrawgi, Prasanjit Rath, Tushar Singhal, and Sandipan Dandapat. 2024. Uncovering Stereotypes in Large Language Models: A Task Complexity-based Approach. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1841–1857, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal): Uncovering Stereotypes in Large Language Models: A Task Complexity-based Approach (Shrawgi et al., EACL 2024)
PDF: https://aclanthology.org/2024.eacl-long.111.pdf
Software: 2024.eacl-long.111.software.zip
Note: 2024.eacl-long.111.note.zip
Video: https://aclanthology.org/2024.eacl-long.111.mp4
