@inproceedings{saeidi-etal-2025-insights,
title = "Insights into Alignment: Evaluating {DPO} and its Variants Across Multiple Tasks",
author = "Saeidi, Amir and
Verma, Shivanshu and
Uddin, Md Nayem and
Baral, Chitta",
editor = "Zhao, Jin and
Wang, Mingyang and
Liu, Zhu",
booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.acl-srw.26/",
doi = "10.18653/v1/2025.acl-srw.26",
pages = "409--421",
ISBN = "979-8-89176-254-1",
abstract = "This study evaluates Direct Preference Optimization (DPO) and its variants for aligning Large Language Models (LLMs) with human preferences, testing three configurations: (1) with Supervised Fine-Tuning (SFT), (2) without SFT, and (3) without SFT but using an instruction-tuned model. We further investigate how training set size influences model performance. Our evaluation spans 13 benchmarks{---}covering dialogue, reasoning, mathematical problem-solving, question answering, truthfulness, MT-Bench, Big Bench, and the Open LLM Leaderboard. We find that: (1) alignment methods often achieve near-optimal performance even with smaller subsets of training data; (2) although they offer limited improvements on complex reasoning tasks, they enhance mathematical problem-solving; and (3) using an instruction-tuned model improves truthfulness. These insights highlight the conditions under which alignment methods excel, as well as their limitations."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="saeidi-etal-2025-insights">
  <titleInfo>
    <title>Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks</title>
  </titleInfo>
  <name type="personal">
    <namePart type="given">Amir</namePart>
    <namePart type="family">Saeidi</namePart>
    <role>
      <roleTerm authority="marcrelator" type="text">author</roleTerm>
    </role>
  </name>
  <name type="personal">
    <namePart type="given">Shivanshu</namePart>
    <namePart type="family">Verma</namePart>
    <role>
      <roleTerm authority="marcrelator" type="text">author</roleTerm>
    </role>
  </name>
  <name type="personal">
    <namePart type="given">Md</namePart>
    <namePart type="given">Nayem</namePart>
    <namePart type="family">Uddin</namePart>
    <role>
      <roleTerm authority="marcrelator" type="text">author</roleTerm>
    </role>
  </name>
  <name type="personal">
    <namePart type="given">Chitta</namePart>
    <namePart type="family">Baral</namePart>
    <role>
      <roleTerm authority="marcrelator" type="text">author</roleTerm>
    </role>
  </name>
  <originInfo>
    <dateIssued>2025-07</dateIssued>
  </originInfo>
  <typeOfResource>text</typeOfResource>
  <relatedItem type="host">
    <titleInfo>
      <title>Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)</title>
    </titleInfo>
    <name type="personal">
      <namePart type="given">Jin</namePart>
      <namePart type="family">Zhao</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">editor</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Mingyang</namePart>
      <namePart type="family">Wang</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">editor</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Zhu</namePart>
      <namePart type="family">Liu</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">editor</roleTerm>
      </role>
    </name>
    <originInfo>
      <publisher>Association for Computational Linguistics</publisher>
      <place>
        <placeTerm type="text">Vienna, Austria</placeTerm>
      </place>
    </originInfo>
    <genre authority="marcgt">conference publication</genre>
    <identifier type="isbn">979-8-89176-254-1</identifier>
  </relatedItem>
  <abstract>This study evaluates Direct Preference Optimization (DPO) and its variants for aligning Large Language Models (LLMs) with human preferences, testing three configurations: (1) with Supervised Fine-Tuning (SFT), (2) without SFT, and (3) without SFT but using an instruction-tuned model. We further investigate how training set size influences model performance. Our evaluation spans 13 benchmarks—covering dialogue, reasoning, mathematical problem-solving, question answering, truthfulness, MT-Bench, Big Bench, and the Open LLM Leaderboard. We find that: (1) alignment methods often achieve near-optimal performance even with smaller subsets of training data; (2) although they offer limited improvements on complex reasoning tasks, they enhance mathematical problem-solving; and (3) using an instruction-tuned model improves truthfulness. These insights highlight the conditions under which alignment methods excel, as well as their limitations.</abstract>
  <identifier type="citekey">saeidi-etal-2025-insights</identifier>
  <identifier type="doi">10.18653/v1/2025.acl-srw.26</identifier>
  <location>
    <url>https://aclanthology.org/2025.acl-srw.26/</url>
  </location>
  <part>
    <date>2025-07</date>
    <extent unit="page">
      <start>409</start>
      <end>421</end>
    </extent>
  </part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks
%A Saeidi, Amir
%A Verma, Shivanshu
%A Uddin, Md Nayem
%A Baral, Chitta
%Y Zhao, Jin
%Y Wang, Mingyang
%Y Liu, Zhu
%S Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
%D 2025
%8 July
%I Association for Computational Linguistics
%C Vienna, Austria
%@ 979-8-89176-254-1
%F saeidi-etal-2025-insights
%X This study evaluates Direct Preference Optimization (DPO) and its variants for aligning Large Language Models (LLMs) with human preferences, testing three configurations: (1) with Supervised Fine-Tuning (SFT), (2) without SFT, and (3) without SFT but using an instruction-tuned model. We further investigate how training set size influences model performance. Our evaluation spans 13 benchmarks—covering dialogue, reasoning, mathematical problem-solving, question answering, truthfulness, MT-Bench, Big Bench, and the Open LLM Leaderboard. We find that: (1) alignment methods often achieve near-optimal performance even with smaller subsets of training data; (2) although they offer limited improvements on complex reasoning tasks, they enhance mathematical problem-solving; and (3) using an instruction-tuned model improves truthfulness. These insights highlight the conditions under which alignment methods excel, as well as their limitations.
%R 10.18653/v1/2025.acl-srw.26
%U https://aclanthology.org/2025.acl-srw.26/
%U https://doi.org/10.18653/v1/2025.acl-srw.26
%P 409-421
Markdown (Informal)
[Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks](https://aclanthology.org/2025.acl-srw.26/) (Saeidi et al., ACL 2025)
ACL
Amir Saeidi, Shivanshu Verma, Md Nayem Uddin, and Chitta Baral. 2025. [Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks](https://aclanthology.org/2025.acl-srw.26/). In *Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)*, pages 409–421, Vienna, Austria. Association for Computational Linguistics.