@article{chatterjee-etal-2025-effect,
  title     = {On the Effect of Instruction Tuning Loss on Generalization},
  author    = {Chatterjee, Anwoy and
               Kowndinya Renduchintala, H. S. V. N. S. and
               Bhatia, Sumit and
               Chakraborty, Tanmoy},
  journal   = {Transactions of the Association for Computational Linguistics},
  volume    = {13},
  year      = {2025},
  address   = {Cambridge, MA},
  publisher = {MIT Press},
  url       = {https://aclanthology.org/2025.tacl-1.62/},
  doi       = {10.1162/tacl.a.42},
  pages     = {1360--1380},
  abstract  = {Instruction tuning has emerged as a pivotal post-training paradigm that enables pre-trained language models to better follow user instructions. Despite its significance, little attention has been given to optimizing the loss function used. A fundamental, yet often overlooked, question is whether the conventional auto-regressive objective{---}where loss is computed only on response tokens, excluding prompt tokens{---}is truly optimal for instruction tuning. In this work, we systematically investigate the impact of differentially weighting prompt and response tokens in instruction tuning loss, and propose Weighted Instruction Tuning (WIT) as a better alternative to conventional instruction tuning. Through extensive experiments on five language models of different families and scale, three finetuning datasets of different sizes, and five diverse evaluation benchmarks, we show that the standard instruction tuning loss often yields suboptimal performance and limited robustness to input prompt variations. We find that a low-to-moderate weight for prompt tokens coupled with a moderate-to-high weight for response tokens yields the best-performing models across settings and also serves as a better starting point for the subsequent preference alignment training. These findings highlight the need to reconsider instruction-tuning loss and offer actionable insights for developing more robust and generalizable models. Our code is open-sourced here.},
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="chatterjee-etal-2025-effect">
<titleInfo>
<title>On the Effect of Instruction Tuning Loss on Generalization</title>
</titleInfo>
<name type="personal">
<namePart type="given">Anwoy</namePart>
<namePart type="family">Chatterjee</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">H</namePart>
<namePart type="given">S</namePart>
<namePart type="given">V</namePart>
<namePart type="given">N</namePart>
<namePart type="given">S</namePart>
<namePart type="family">Kowndinya Renduchintala</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sumit</namePart>
<namePart type="family">Bhatia</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Tanmoy</namePart>
<namePart type="family">Chakraborty</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<genre authority="bibutilsgt">journal article</genre>
<relatedItem type="host">
<titleInfo>
<title>Transactions of the Association for Computational Linguistics</title>
</titleInfo>
<originInfo>
<issuance>continuing</issuance>
<publisher>MIT Press</publisher>
<place>
<placeTerm type="text">Cambridge, MA</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">periodical</genre>
<genre authority="bibutilsgt">academic journal</genre>
</relatedItem>
<abstract>Instruction tuning has emerged as a pivotal post-training paradigm that enables pre-trained language models to better follow user instructions. Despite its significance, little attention has been given to optimizing the loss function used. A fundamental, yet often overlooked, question is whether the conventional auto-regressive objective—where loss is computed only on response tokens, excluding prompt tokens—is truly optimal for instruction tuning. In this work, we systematically investigate the impact of differentially weighting prompt and response tokens in instruction tuning loss, and propose Weighted Instruction Tuning (WIT) as a better alternative to conventional instruction tuning. Through extensive experiments on five language models of different families and scale, three finetuning datasets of different sizes, and five diverse evaluation benchmarks, we show that the standard instruction tuning loss often yields suboptimal performance and limited robustness to input prompt variations. We find that a low-to-moderate weight for prompt tokens coupled with a moderate-to-high weight for response tokens yields the best-performing models across settings and also serves as a better starting point for the subsequent preference alignment training. These findings highlight the need to reconsider instruction-tuning loss and offer actionable insights for developing more robust and generalizable models. Our code is open-sourced here.</abstract>
<identifier type="citekey">chatterjee-etal-2025-effect</identifier>
<identifier type="doi">10.1162/tacl.a.42</identifier>
<location>
<url>https://aclanthology.org/2025.tacl-1.62/</url>
</location>
<part>
<date>2025</date>
<detail type="volume"><number>13</number></detail>
<extent unit="page">
<start>1360</start>
<end>1380</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Journal Article
%T On the Effect of Instruction Tuning Loss on Generalization
%A Chatterjee, Anwoy
%A Kowndinya Renduchintala, H. S. V. N. S.
%A Bhatia, Sumit
%A Chakraborty, Tanmoy
%J Transactions of the Association for Computational Linguistics
%D 2025
%V 13
%I MIT Press
%C Cambridge, MA
%F chatterjee-etal-2025-effect
%X Instruction tuning has emerged as a pivotal post-training paradigm that enables pre-trained language models to better follow user instructions. Despite its significance, little attention has been given to optimizing the loss function used. A fundamental, yet often overlooked, question is whether the conventional auto-regressive objective—where loss is computed only on response tokens, excluding prompt tokens—is truly optimal for instruction tuning. In this work, we systematically investigate the impact of differentially weighting prompt and response tokens in instruction tuning loss, and propose Weighted Instruction Tuning (WIT) as a better alternative to conventional instruction tuning. Through extensive experiments on five language models of different families and scale, three finetuning datasets of different sizes, and five diverse evaluation benchmarks, we show that the standard instruction tuning loss often yields suboptimal performance and limited robustness to input prompt variations. We find that a low-to-moderate weight for prompt tokens coupled with a moderate-to-high weight for response tokens yields the best-performing models across settings and also serves as a better starting point for the subsequent preference alignment training. These findings highlight the need to reconsider instruction-tuning loss and offer actionable insights for developing more robust and generalizable models. Our code is open-sourced here.
%R 10.1162/tacl.a.42
%U https://aclanthology.org/2025.tacl-1.62/
%U https://doi.org/10.1162/tacl.a.42
%P 1360-1380
Markdown (Informal)
[On the Effect of Instruction Tuning Loss on Generalization](https://aclanthology.org/2025.tacl-1.62/) (Chatterjee et al., TACL 2025)
ACL