Drift: Decoding-time Personalized Alignments with Implicit User Preferences

Minbeom Kim; Kang-il Lee; Seongho Joo; Hwaran Lee; Thibaut Thonet; Kyomin Jung

Abstract

Personalized alignments towards individual users have been a long-standing goal in large language models (LLMs). We introduce Drift, a novel framework that personalizes LLMs at decoding time with implicit user preferences. Unlike traditional Reinforcement Learning from Human Feedback (RLHF), which relies on vast annotated datasets and expensive gradient updates, Drift operates in a training-free manner by steering a frozen LLM through few-shot preference modeling. Our approach represents user preferences as a composition of interpretable and predefined attributes, and employs a zero-shot rewarding mechanism based on contrastive system prompts. Experiments on both a synthetic persona dataset Perspective and a real human-annotated dataset PRISM demonstrate that Drift achieves performance comparable to standard RLHF methods while using only 50–100 examples. Our results show that Drift delivers not only computationally efficient but also interpretable personalization.
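The decoding-time idea in the abstract can be illustrated with a small toy sketch. It is not the paper's formulation or code. It assumes that each attribute's zero-shot reward is the difference between next-token log-probabilities under a "positive" and a "negative" system prompt, and that a frozen model is steered by adding a weighted sum of these rewards to its log-probabilities. The function names, the toy logits, the strength parameter beta, and the assumption that attribute weights have already been estimated from a few preference examples are all hypothetical.

```python
import math


def log_softmax(logits):
    """Convert raw logits to log-probabilities in a numerically stable way."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def attribute_reward(pos_logits, neg_logits):
    """Hypothetical zero-shot reward for one attribute (an assumption, not
    the paper's definition): log p(token | positive system prompt) minus
    log p(token | negative system prompt), for every candidate token."""
    pos = log_softmax(pos_logits)
    neg = log_softmax(neg_logits)
    return [p - n for p, n in zip(pos, neg)]


def steer_next_token(base_logits, rewards_per_attribute, weights, beta=1.0):
    """Add the weighted attribute rewards to the frozen model's
    log-probabilities and renormalize. A smaller beta steers harder."""
    base = log_softmax(base_logits)
    steered = [
        b + sum(w * r[t] for w, r in zip(weights, rewards_per_attribute)) / beta
        for t, b in enumerate(base)
    ]
    return [math.exp(x) for x in log_softmax(steered)]


# Toy example: a 3-token vocabulary and a single attribute.
base_logits = [0.0, 0.0, 0.0]    # frozen model is indifferent
pos_logits = [2.0, 0.0, 0.0]     # positive prompt favors token 0
neg_logits = [0.0, 0.0, 2.0]     # negative prompt favors token 2
reward = attribute_reward(pos_logits, neg_logits)
probs = steer_next_token(base_logits, [reward], weights=[1.0])
```

In this toy setting, the token favored by the positive prompt ends up with the highest steered probability, and setting the attribute weight to zero recovers the frozen model's distribution. No model parameters are updated at any point, which is the sense in which such a scheme is training-free.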

Anthology ID: 2025.findings-emnlp.324
Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 6107–6126
URL: https://aclanthology.org/2025.findings-emnlp.324/
Cite (ACL): Minbeom Kim, Kang-il Lee, Seongho Joo, Hwaran Lee, Thibaut Thonet, and Kyomin Jung. 2025. Drift: Decoding-time Personalized Alignments with Implicit User Preferences. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 6107–6126, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Drift: Decoding-time Personalized Alignments with Implicit User Preferences (Kim et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-emnlp.324.pdf
Checklist: 2025.findings-emnlp.324.checklist.pdf
