Christoph Dann
2024
Conditional Language Policy: A General Framework For Steerable Multi-Objective Finetuning
Kaiwen Wang | Rahul Kidambi | Ryan Sullivan | Alekh Agarwal | Christoph Dann | Andrea Michi | Marco Gelmi | Yunxuan Li | Raghav Gupta | Kumar Avinava Dubey | Alexandre Rame | Johan Ferret | Geoffrey Cideron | Le Hou | Hongkun Yu | Amr Ahmed | Aranyak Mehta | Leonard Hussenot | Olivier Bachem | Edouard Leurent
Findings of the Association for Computational Linguistics: EMNLP 2024
Reward-based finetuning is crucial for aligning language policies with intended behaviors (*e.g.*, creativity and safety). A key challenge is to develop steerable language models that trade off multiple (conflicting) objectives in a flexible and efficient manner. This paper presents Conditional Language Policy (CLP), a general framework for finetuning language models on multiple objectives. Building on techniques from multi-task training and parameter-efficient finetuning, CLP learns steerable models that effectively trade off conflicting objectives at *inference time*. Notably, this does not require training or maintaining multiple models to achieve different trade-offs between the objectives. Through extensive experiments and ablations on two summarization datasets, we show that CLP learns steerable language models that outperform and Pareto-dominate the existing approaches for multi-objective