Do LLMs Exhibit Human-like Response Biases? A Case Study in Survey Design

Lindia Tjuatja, Valerie Chen, Tongshuang Wu, Ameet Talwalkar, Graham Neubig


Abstract
One widely cited barrier to the adoption of LLMs as proxies for humans in subjective tasks is their sensitivity to prompt wording. Interestingly, however, humans also display sensitivities to instruction changes in the form of response biases. We investigate the extent to which LLMs reflect human response biases, if at all. We look to survey design, where human response biases caused by changes in the wordings of “prompts” have been extensively explored in social psychology literature. Drawing from these works, we design a dataset and framework to evaluate whether LLMs exhibit human-like response biases in survey questionnaires. Our comprehensive evaluation of nine models shows that popular open and commercial LLMs generally fail to reflect human-like behavior, particularly models that have undergone RLHF. Furthermore, even if a model shows a significant change in the same direction as humans, we find that it is sensitive to perturbations that do not elicit significant changes in humans. These results highlight the pitfalls of using LLMs as human proxies and underscore the need for finer-grained characterizations of model behavior.
Anthology ID:
2024.tacl-1.56
Volume:
Transactions of the Association for Computational Linguistics, Volume 12
Year:
2024
Address:
Cambridge, MA
Venue:
TACL
Publisher:
MIT Press
Pages:
1011–1026
URL:
https://aclanthology.org/2024.tacl-1.56
DOI:
10.1162/tacl_a_00685
Cite (ACL):
Lindia Tjuatja, Valerie Chen, Tongshuang Wu, Ameet Talwalkar, and Graham Neubig. 2024. Do LLMs Exhibit Human-like Response Biases? A Case Study in Survey Design. Transactions of the Association for Computational Linguistics, 12:1011–1026.
Cite (Informal):
Do LLMs Exhibit Human-like Response Biases? A Case Study in Survey Design (Tjuatja et al., TACL 2024)
PDF:
https://aclanthology.org/2024.tacl-1.56.pdf