Youngbin Choi

2025

Personalizing large language models (LLMs) is important for aligning outputs with diverse user preferences, yet existing methods struggle with flexibility and generalization. We propose CoPL (Collaborative Preference Learning), a graph-based collaborative filtering framework that models user-response relationships to enhance preference estimation, particularly in sparse annotation settings. By integrating a mixture of LoRA experts, CoPL efficiently fine-tunes LLMs while dynamically balancing shared and user-specific preferences. Additionally, an optimization-free adaptation strategy enables generalization to unseen users without fine-tuning. Experiments on TL;DR, UltraFeedback-P, and PersonalLLM datasets demonstrate that CoPL outperforms existing personalized reward models, effectively capturing both common and controversial preferences, making it a scalable solution for personalized LLM alignment.

pdf bib abs
ChronoBias: A Benchmark for Evaluating Time-conditional Group Bias in the Time-sensitive Knowledge of Large Language Models
Kyungmin Kim | Youngbin Choi | Hyounghun Kim | Dongwoo Kim | Sangdon Park
Findings of the Association for Computational Linguistics: EMNLP 2025

In this paper, we propose ChronoBias, a novel benchmark for evaluating time-conditional group bias in the time-sensitive knowledge of large language models (LLMs).Our benchmark is constructed via a template-based semi-automated generation method, balancing the quality-quantity trade-off in existing benchmark curation approaches.For knowledge that changes over time, time-conditional group bias exhibits varying patterns across time intervals, evident in both the best- and worst-performing groups and in the bias metric itself.In addition to parametric knowledge bias–which influences group bias across all time intervals–we identify time-sensitivity bias as an additional factor after a model’s knowledge cutoff, accounting for much of the variation in time-conditional group bias over time.Since both biases are irreducible, retrieval-augmented generation (RAG) can be a promising approach, as it can address post-cutoff knowledge and better leverage pretraining knowledge that is underrepresented in the model parameters.While RAG improves both overall performance and group bias, we observe that the disparate patterns of time-conditional group bias still persist.Therefore, through extensive experiments with various model configurations, we illustrate how accurate and fair RAG-based LLMs should behave and provide actionable guidelines toward constructing such ideal models.

We introduce GeoDANO, a geometric vision-language model (VLM) with a domain-agnostic vision encoder, for solving plane geometry problems. Although VLMs have been employed for solving geometry problems, their ability to recognize geometric features remains insufficiently analyzed. To address this gap, we propose a benchmark that evaluates the recognition of visual geometric features, including primitives such as dots and lines, and relations such as orthogonality. Our preliminary study shows that vision encoders often used in general-purpose VLMs, e.g., OpenCLIP, fail to detect these features and struggle to generalize across domains. To overcome the limitation, we develop GeoCLIP, a CLIP-based model trained on synthetic geometric diagram–caption pairs. Benchmark results show that GeoCLIP outperforms existing vision encoders in recognizing geometric features. We then propose our VLM, GeoDANO, which augments GeoCLIP with a domain adaptation strategy for unseen diagram styles. GeoDANO outperforms specialized methods for plane geometry problems and GPT-4o on MathVerse. The implementation is available at https://github.com/ml-postech/GeoDANO.

Co-authors

Venues

findings2
emnlp1

Fix author