2024
A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques
Megh Thakkar | Quentin Fournier | Matthew Riemer | Pin-Yu Chen | Amal Zouaq | Payel Das | Sarath Chandar
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models are first pre-trained on trillions of tokens and then instruction-tuned or aligned to specific preferences. While pre-training remains out of reach for most researchers due to the compute required, fine-tuning has become affordable thanks to parameter-efficient methods such as LoRA and QLoRA. Alignment is known to be sensitive to the many factors involved, including the quantity and quality of data, the alignment method, and the adapter rank. However, there has not yet been an extensive study of their effect on downstream performance. To address this gap, we conduct an in-depth investigation of the impact of popular choices for three crucial axes: (i) the alignment dataset (HH-RLHF and BeaverTails), (ii) the alignment technique (SFT and DPO), and (iii) the model (LLaMA-1, Vicuna-v1.3, Mistral-7b, and Mistral-7b-Instruct). Our extensive setup spanning over 300 experiments reveals consistent trends and unexpected findings. We observe how more informative data helps with preference alignment, cases where supervised fine-tuning outperforms preference optimization, and how aligning to a distinct preference boosts performance on downstream tasks. Through our in-depth analyses, we put forward key guidelines to help researchers perform more effective parameter-efficient LLM alignment.
2019
Recursive Routing Networks: Learning to Compose Modules for Language Understanding
Ignacio Cases | Clemens Rosenbaum | Matthew Riemer | Atticus Geiger | Tim Klinger | Alex Tamkin | Olivia Li | Sandhini Agarwal | Joshua D. Greene | Dan Jurafsky | Christopher Potts | Lauri Karttunen
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
We introduce Recursive Routing Networks (RRNs), which are modular, adaptable models that learn effectively in diverse environments. RRNs consist of a set of functions, typically organized into a grid, and a meta-learner decision-making component called the router. The model jointly optimizes the parameters of the functions and the meta-learner’s policy for routing inputs through those functions. RRNs can be incorporated into existing architectures in a number of ways; we explore adding them to word representation layers, recurrent network hidden layers, and classifier layers. Our evaluation task is natural language inference (NLI). Using the MultiNLI corpus, we show that an RRN’s routing decisions reflect the high-level genre structure of that corpus. To show that RRNs can learn to specialize to more fine-grained semantic distinctions, we introduce a new corpus of NLI examples involving implicative predicates, and show that the model components become fine-tuned to the inferential signatures that are characteristic of these predicates.
2015
A Deep Learning and Knowledge Transfer Based Architecture for Social Media User Characteristic Determination
Matthew Riemer | Sophia Krasikov | Harini Srinivasan
Proceedings of the third International Workshop on Natural Language Processing for Social Media