Quantifying Adaptability in Pre-trained Language Models with 500 Tasks

Belinda Li, Jane Yu, Madian Khabsa, Luke Zettlemoyer, Alon Halevy, Jacob Andreas


Abstract
When a neural language model (LM) is adapted to perform a new task, what aspects of the task predict the eventual performance of the model? In NLP, systematic features of LM generalization to individual examples are well characterized, but systematic aspects of LM adaptability to new tasks are not nearly as well understood. We present a large-scale empirical study of the features and limits of LM adaptability using a new benchmark, TaskBench500, built from 500 procedurally generated sequence modeling tasks. These tasks combine core aspects of language processing, including lexical semantics, sequence processing, memorization, logical reasoning, and world knowledge. Using TaskBench500, we evaluate three facets of adaptability, finding that: (1) adaptation procedures differ dramatically in their ability to memorize small datasets; (2) within a subset of task types, adaptation procedures exhibit compositional adaptability to complex tasks; and (3) failure to match training label distributions is explained by mismatches in the intrinsic difficulty of predicting individual labels. Our experiments show that adaptability to new tasks, like generalization to new examples, can be systematically described and understood, and we conclude with a discussion of additional aspects of adaptability that could be studied using the new benchmark.
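For illustration only, the following is a minimal Python sketch (not the authors' implementation; the function names, the antonym dictionary, and the composition operators are hypothetical) of how atomic word-level functions might be procedurally composed into sequence tasks in the spirit of TaskBench500.

# Hypothetical sketch of procedural task composition; illustrative only,
# not the paper's or the released repository's API.

ANTONYMS = {"hot": "cold", "big": "small", "fast": "slow"}

def identity(word):
    # Atomic memorization-style task: copy the input word.
    return word

def antonym(word):
    # Atomic lexical-semantics task: map a word to its antonym.
    return ANTONYMS.get(word, word)

def compose(f, g):
    # Chain two word-level tasks into a more complex task.
    return lambda word: f(g(word))

def map_over_sequence(word_task):
    # Lift a word-level task to a sequence-level task.
    return lambda words: [word_task(w) for w in words]

if __name__ == "__main__":
    task = map_over_sequence(compose(antonym, identity))
    print(task(["hot", "big", "fast"]))  # ['cold', 'small', 'slow']

In this sketch, complex tasks are built by composing simpler ones, mirroring the paper's notion of compositional adaptability: performance on a composed task can be studied relative to performance on its atomic parts.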
Anthology ID:
2022.naacl-main.346
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
July
Year:
2022
Address:
Seattle, United States
Editors:
Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
4696–4715
URL:
https://aclanthology.org/2022.naacl-main.346
DOI:
10.18653/v1/2022.naacl-main.346
Cite (ACL):
Belinda Li, Jane Yu, Madian Khabsa, Luke Zettlemoyer, Alon Halevy, and Jacob Andreas. 2022. Quantifying Adaptability in Pre-trained Language Models with 500 Tasks. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4696–4715, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
Quantifying Adaptability in Pre-trained Language Models with 500 Tasks (Li et al., NAACL 2022)
PDF:
https://aclanthology.org/2022.naacl-main.346.pdf
Video:
https://aclanthology.org/2022.naacl-main.346.mp4
Code:
belindal/taskbench500 + additional community code