Biases in Large Language Model-Elicited Text: A Case Study in Natural Language Inference

Grace Proebsting, Adam Poliak


Abstract
We test whether NLP datasets created with Large Language Models (LLMs) contain annotation artifacts and social biases, just as NLP datasets elicited from crowdsource workers do. We recreate a portion of the Stanford Natural Language Inference corpus using GPT-4, Llama-2 70b Chat, and Mistral 7b Instruct. We train hypothesis-only classifiers to determine whether LLM-elicited NLI datasets contain annotation artifacts. Next, we use pointwise mutual information (PMI) to identify the words in each dataset that are associated with gender, race, and age-related terms. On our LLM-generated NLI datasets, fine-tuned BERT hypothesis-only classifiers achieve between 86% and 96% accuracy. Our analyses further characterize the annotation artifacts and stereotypical biases in LLM-generated datasets.
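The abstract's bias analysis relies on pointwise mutual information between vocabulary words and demographic terms. As a minimal sketch of that idea (the paper's exact co-occurrence definition, smoothing, and term lists are not given here, so the sentence-level counting, the `min_count` threshold, and the example word sets below are assumptions):

import math
from collections import Counter

def pmi_scores(hypotheses, group_terms, min_count=10):
    """Score each vocabulary word by its PMI with a set of demographic
    terms (e.g., gendered words), computed over tokenized hypotheses.
    PMI(w, g) = log[ p(w, g) / (p(w) * p(g)) ], with co-occurrence
    counted at the sentence level (an assumption for illustration)."""
    word_counts = Counter()   # sentences containing word w
    joint_counts = Counter()  # sentences containing w AND a group term
    group_sentences = 0       # sentences containing any group term
    total = len(hypotheses)

    for sent in hypotheses:
        tokens = set(sent.lower().split())
        has_group = bool(tokens & group_terms)
        group_sentences += has_group
        for w in tokens:
            word_counts[w] += 1
            if has_group:
                joint_counts[w] += 1

    p_group = group_sentences / total
    scores = {}
    for w, c in word_counts.items():
        if c < min_count or joint_counts[w] == 0:
            continue
        p_w = c / total
        p_joint = joint_counts[w] / total
        scores[w] = math.log(p_joint / (p_w * p_group))
    return scores

# Hypothetical usage: rank words most associated with gendered terms
# in a set of LLM-generated hypotheses (toy data, not from the paper).
gendered = {"he", "she", "man", "woman", "his", "her"}
hyps = ["A man is playing guitar .", "She is cooking dinner ."]
print(sorted(pmi_scores(hyps, gendered, min_count=1).items(),
             key=lambda kv: -kv[1])[:10])

Words with high PMI against a demographic term set are candidates for stereotypical associations; the paper reports such associations for gender, race, and age-related terms.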
Anthology ID:
2025.coling-main.389
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
5836–5851
URL:
https://aclanthology.org/2025.coling-main.389/
Cite (ACL):
Grace Proebsting and Adam Poliak. 2025. Biases in Large Language Model-Elicited Text: A Case Study in Natural Language Inference. In Proceedings of the 31st International Conference on Computational Linguistics, pages 5836–5851, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Biases in Large Language Model-Elicited Text: A Case Study in Natural Language Inference (Proebsting & Poliak, COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.389.pdf