Aligning Language Models to User Opinions

EunJeong Hwang, Bodhisattwa Majumder, Niket Tandon


Abstract
An important aspect of developing LLMs that interact with humans is aligning a model's behavior to its users. It is possible to prompt an LLM into behaving as a certain persona, especially a user group or ideological persona the model captured during its pretraining stage. But how to best align an LLM with a specific user, rather than a demographic or ideological group, remains an open question. Mining public opinion surveys (by PEW research), we find that the opinions of a user and their demographics and ideologies are not mutual predictors. We use this insight to align LLMs by modeling relevant past user opinions in addition to user demographics and ideology, achieving accuracy gains of up to 7 points in predicting public opinions from survey questions across a broad set of topics. Our work opens up research avenues for bringing user opinions in as an important ingredient in aligning language models.
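As a rough illustration of the setup the abstract describes, the sketch below builds a persona-conditioned prompt from a user's demographics, ideology, and relevant past opinions, then asks for an answer to a new survey question. This is a minimal sketch, not the paper's exact prompt: the field names, template wording, and top-k opinion selection are assumptions.

```python
# Minimal sketch of persona-conditioned prompting for survey-opinion prediction.
# The prompt template, field names, and top-k opinion cutoff are illustrative
# assumptions, not the exact setup used in the paper.

def build_opinion_prompt(demographics: dict[str, str],
                         ideology: str,
                         past_opinions: list[str],
                         question: str,
                         options: list[str],
                         top_k: int = 8) -> str:
    """Combine demographics, ideology, and relevant past opinions into a
    single prompt asking the model to predict the user's answer."""
    demo_lines = "\n".join(f"- {k}: {v}" for k, v in demographics.items())
    opinion_lines = "\n".join(f"- {o}" for o in past_opinions[:top_k])
    option_lines = "\n".join(f"{chr(65 + i)}. {o}" for i, o in enumerate(options))
    return (
        "A person has the following profile.\n"
        f"Demographics:\n{demo_lines}\n"
        f"Ideology: {ideology}\n"
        f"Opinions they previously expressed in surveys:\n{opinion_lines}\n\n"
        "How would this person most likely answer the question below?\n"
        f"Question: {question}\n"
        f"Options:\n{option_lines}\n"
        "Answer with the letter of the most likely option."
    )


if __name__ == "__main__":
    prompt = build_opinion_prompt(
        demographics={"age": "45-54", "region": "Midwest", "education": "College graduate"},
        ideology="moderate",
        past_opinions=["Thinks local news coverage is important",
                       "Is concerned about online privacy"],
        question="How much, if at all, do you worry about misinformation online?",
        options=["A great deal", "A fair amount", "Not much", "Not at all"],
    )
    print(prompt)
```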
Anthology ID: 2023.findings-emnlp.393
Volume: Findings of the Association for Computational Linguistics: EMNLP 2023
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 5906–5919
URL: https://aclanthology.org/2023.findings-emnlp.393
DOI: 10.18653/v1/2023.findings-emnlp.393
Cite (ACL): EunJeong Hwang, Bodhisattwa Majumder, and Niket Tandon. 2023. Aligning Language Models to User Opinions. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 5906–5919, Singapore. Association for Computational Linguistics.
Cite (Informal): Aligning Language Models to User Opinions (Hwang et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-emnlp.393.pdf