Investigating the Generalizability of Pretrained Language Models across Multiple Dimensions: A Case Study of NLI and MRC

Ritam Dutt, Sagnik Ray Choudhury, Varun Venkat Rao, Carolyn Rose, V.G.Vinod Vydiswaran


Abstract
Generalization refers to the ability of machine learning models to perform well on dataset distributions different from those they were trained on. While several prior works have characterized the generalizability of NLP models along individual dimensions, such as domain shift, adversarial perturbations, or compositional variations, most studies were carried out in a stand-alone setting, emphasizing a single dimension of interest. We bridge this gap by systematically investigating the generalizability of pre-trained language models across different architectures, sizes, and training strategies, over multiple dimensions, for the tasks of natural language inference and question answering. Our results indicate that model instances typically exhibit consistent generalization trends, i.e., they generalize equally well (or poorly) across most scenarios, and this ability is correlated with model architecture, base dataset performance, size, and training mechanism. We hope this research motivates further work in a) developing a multi-dimensional generalization benchmark for systematic evaluation and b) examining the reasons behind models’ generalization abilities. The code and data are available at https://github.com/sagnik/md-gen-nlp, and the trained models are released at https://huggingface.co/varun-v-rao.
Anthology ID:
2024.genbench-1.11
Volume:
Proceedings of the 2nd GenBench Workshop on Generalisation (Benchmarking) in NLP
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Dieuwke Hupkes, Verna Dankers, Khuyagbaatar Batsuren, Amirhossein Kazemnejad, Christos Christodoulopoulos, Mario Giulianelli, Ryan Cotterell
Venue:
GenBench
Publisher:
Association for Computational Linguistics
Pages:
165–182
URL:
https://aclanthology.org/2024.genbench-1.11
Cite (ACL):
Ritam Dutt, Sagnik Ray Choudhury, Varun Venkat Rao, Carolyn Rose, and V.G.Vinod Vydiswaran. 2024. Investigating the Generalizability of Pretrained Language Models across Multiple Dimensions: A Case Study of NLI and MRC. In Proceedings of the 2nd GenBench Workshop on Generalisation (Benchmarking) in NLP, pages 165–182, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Investigating the Generalizability of Pretrained Language Models across Multiple Dimensions: A Case Study of NLI and MRC (Dutt et al., GenBench 2024)
PDF:
https://aclanthology.org/2024.genbench-1.11.pdf