On Curriculum Learning for Commonsense Reasoning

Adyasha Maharana, Mohit Bansal


Abstract
Commonsense reasoning tasks follow a standard paradigm of finetuning pretrained language models on the target task data, where samples are introduced to the model in a random order during training. However, recent research suggests that data order can have a significant impact on the performance of finetuned models for natural language understanding. Hence, we examine the effect of a human-like easy-to-difficult curriculum during finetuning of language models for commonsense reasoning tasks. We use paced curriculum learning to rank data and sample training mini-batches with increasing levels of difficulty from the ranked dataset during finetuning. Further, we investigate the effect of an adaptive curriculum, i.e., the data ranking is dynamically updated during training based on the current state of the learner model. We use a teacher model to measure the difficulty of each sample and experiment with three measures based on question answering probability, variability and out-of-distribution. To understand the effectiveness of curriculum learning in various scenarios, we apply it to full model finetuning as well as parameter-efficient prompt-tuning settings. Our results show that fixed as well as adaptive curriculum learning significantly improve performance for five commonsense reasoning tasks, i.e., SocialIQA, CosmosQA, CODAH, HellaSwag, and WinoGrande, in both tuning settings. Further, we find that prioritizing the difficult samples in the tail end of training improves generalization to unseen in-domain data as well as out-of-domain data. Our work provides evidence for, and encourages research into, curriculum learning for commonsense reasoning.
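The paced sampling described in the abstract (rank the data by difficulty, then draw mini-batches from a gradually widening easy-to-difficult prefix) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear pacing schedule, the `start_fraction` default, and the names `pacing_fraction` and `curriculum_batches` are assumptions made for illustration, and `difficulty` stands in for any per-sample score, such as one produced by a teacher model.

```python
import random


def pacing_fraction(step, total_steps, start_fraction=0.2):
    """Fraction of the ranked data available at `step`.

    Linear pacing is an assumption here; other schedules are possible.
    """
    progress = min(1.0, step / max(1, total_steps))
    return min(1.0, start_fraction + (1.0 - start_fraction) * progress)


def curriculum_batches(samples, difficulty, total_steps, batch_size, seed=0):
    """Yield mini-batches sampled from an easy-to-difficult ranking.

    `difficulty` is a hypothetical scoring function: lower means easier.
    """
    rng = random.Random(seed)
    ranked = sorted(samples, key=difficulty)  # easiest samples first
    for step in range(total_steps):
        # Widen the pool of eligible samples as training progresses.
        limit = int(pacing_fraction(step, total_steps) * len(ranked))
        limit = min(len(ranked), max(batch_size, limit))
        pool = ranked[:limit]
        yield rng.sample(pool, min(batch_size, len(pool)))


if __name__ == "__main__":
    data = list(range(100))  # toy samples whose value doubles as difficulty
    for i, batch in enumerate(curriculum_batches(data, lambda x: x, 5, 4)):
        print(i, sorted(batch))
```

An adaptive variant in the spirit of the abstract would recompute the ranking during training from the learner's current state rather than sorting once up front.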
Anthology ID:
2022.naacl-main.72
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
July
Year:
2022
Address:
Seattle, United States
Editors:
Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
983–992
URL:
https://aclanthology.org/2022.naacl-main.72
DOI:
10.18653/v1/2022.naacl-main.72
Cite (ACL):
Adyasha Maharana and Mohit Bansal. 2022. On Curriculum Learning for Commonsense Reasoning. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 983–992, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
On Curriculum Learning for Commonsense Reasoning (Maharana & Bansal, NAACL 2022)
PDF:
https://aclanthology.org/2022.naacl-main.72.pdf
Software:
 2022.naacl-main.72.software.zip
Video:
 https://aclanthology.org/2022.naacl-main.72.mp4
Code:
 adymaharana/curriculum_learning
Data:
CODAH, CosmosQA, GLUE, HellaSwag, QNLI, WSC, WinoGrande