Rajesh Ranganath
2025
Learning Is Not A Race: Improving Retrieval in Language Models via Equal Learning
Wanqian Yang | Aahlad Manas Puli | Rajesh Ranganath
Findings of the Association for Computational Linguistics: EMNLP 2025
Many applications to which modern large language models (LLMs) are deployed are retrieval tasks: the answer can be recovered from context, and success is a matter of learning generalizable features from data. However, this is easier said than done. Overparametrized models trained with cross-entropy loss can overfit to noise. We argue that such overfitting is prone to happen when the model identifies mechanisms that rapidly drive down the loss of certain tokens early in training. Fitting some tokens early reduces gradient signals in later iterations; as a result, the remaining tokens are more vulnerable to noise overfitting. We dub this phenomenon unequal learning and show that LLMs with longer contexts or larger embedding sizes are prone to this failure mode. In this work, we argue that learning training samples at an equal rate helps counter such biases. We highlight two mechanisms that promote equal learning: (i) loss functions that regularize margins to be uniform across training samples, and (ii) small learning rates at the start of training (e.g., via warmup). We demonstrate these approaches on various synthetic and natural language datasets.
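The second mechanism the abstract names, keeping learning rates small at the start of training, is commonly implemented as a warmup schedule. The sketch below shows a generic linear warmup; the specific schedule shape, warmup length, and base rate here are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch of linear learning-rate warmup. The base rate and warmup
# length below are illustrative assumptions, not values from the paper.
def warmup_lr(step, base_lr=1e-3, warmup_steps=1000):
    """Linearly ramp the learning rate from ~0 up to base_lr over warmup_steps."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr

# Early steps use tiny learning rates, which damps the rapid early fitting
# of "easy" tokens that the abstract associates with unequal learning.
print(warmup_lr(0))       # very small rate at the first step
print(warmup_lr(2000))    # full base rate after warmup
```

In practice such a schedule would wrap an optimizer's learning rate each step; the point here is only that the rate stays small during the earliest iterations.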
2009
It’s Not You, it’s Me: Detecting Flirting and its Misperception in Speed-Dates
Rajesh Ranganath | Dan Jurafsky | Dan McFarland
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing
Extracting Social Meaning: Identifying Interactional Style in Spoken Conversation
Dan Jurafsky | Rajesh Ranganath | Dan McFarland
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics