Aligning LLMs with Individual Preferences via Interaction

Shujin Wu; Yi R. Fung; Cheng Qian; Jeonghwan Kim; Dilek Hakkani-Tur; Heng Ji

Abstract

As large language models (LLMs) demonstrate increasingly advanced capabilities, aligning their behaviors with human values and preferences becomes crucial for their wide adoption. While previous research focuses on general alignment to principles such as helpfulness, harmlessness, and honesty, the need to account for individual and diverse preferences has been largely overlooked, potentially undermining customized human experiences. To address this gap, we train LLMs that can “interact to align”, essentially cultivating the meta-skill of LLMs to implicitly infer the unspoken personalized preferences of the current user through multi-turn conversations, and then dynamically align their following behaviors and responses to these inferred preferences. Our approach involves establishing a diverse pool of 3,310 distinct user personas by initially creating seed examples, which are then expanded through iterative self-generation and filtering. Guided by distinct user personas, we leverage multi-LLM collaboration to develop a multi-turn preference dataset containing 3K+ multi-turn conversations in tree structures. Finally, we apply supervised fine-tuning and reinforcement learning to enhance LLMs using this dataset. For evaluation, we establish the ALOE (ALign with custOmized prEferences) benchmark, consisting of 100 carefully selected examples and well-designed metrics to measure the customized alignment performance during conversations. Experimental results demonstrate the effectiveness of our method in enabling dynamic, personalized alignment via interaction. The code and dataset will be made public.

Anthology ID: 2025.coling-main.511
Volume: Proceedings of the 31st International Conference on Computational Linguistics
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 7648–7662
URL: https://aclanthology.org/2025.coling-main.511/
Cite (ACL): Shujin Wu, Yi R. Fung, Cheng Qian, Jeonghwan Kim, Dilek Hakkani-Tur, and Heng Ji. 2025. Aligning LLMs with Individual Preferences via Interaction. In Proceedings of the 31st International Conference on Computational Linguistics, pages 7648–7662, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): Aligning LLMs with Individual Preferences via Interaction (Wu et al., COLING 2025)
PDF: https://aclanthology.org/2025.coling-main.511.pdf