What to Pre-Train on? Efficient Intermediate Task Selection

Clifton Poth, Jonas Pfeiffer, Andreas Rücklé, Iryna Gurevych


Abstract
Intermediate task fine-tuning has been shown to culminate in large transfer gains across many NLP tasks. With an abundance of candidate datasets as well as pre-trained language models, it has become infeasible to experiment with all combinations to find the best transfer setting. In this work, we provide a comprehensive comparison of different methods for efficiently identifying beneficial tasks for intermediate transfer learning. We focus on parameter- and computationally efficient adapter settings, highlight different data-availability scenarios, and provide expense estimates for each method. We experiment with a diverse set of 42 intermediate and 11 target English classification, multiple-choice, question answering, and sequence tagging tasks. Our results demonstrate that efficient embedding-based methods, which rely solely on the respective datasets, outperform computationally expensive few-shot fine-tuning approaches. Our best methods achieve an average Regret@3 of 1% across all target tasks, demonstrating that we are able to efficiently identify the best datasets for intermediate training.
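The abstract reports Regret@3 without defining it on this page. As a minimal sketch, assuming the common definition of Regret@k as the relative gap between the best transfer score over all intermediate tasks and the best score among the k top-ranked tasks, the metric could be computed as below. The function name, task names, and scores are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def regret_at_k(scores: Dict[str, float], ranking: List[str], k: int) -> float:
    """Relative gap between the oracle score and the best score in the top-k.

    Assumed definition (not stated on this page):
        regret = (best score overall - best score among top-k) / best score overall
    """
    if k < 1 or not ranking:
        raise ValueError("k must be >= 1 and ranking must be non-empty")
    oracle = max(scores.values())
    best_in_top_k = max(scores[name] for name in ranking[:k])
    return (oracle - best_in_top_k) / oracle


# Hypothetical target-task scores after transferring from each intermediate task.
transfer_scores = {"task_a": 80.0, "task_b": 78.0, "task_c": 75.0, "task_d": 70.0}

# Hypothetical ranking produced by some selection method (best guess first).
predicted_ranking = ["task_c", "task_b", "task_d", "task_a"]

regret = regret_at_k(transfer_scores, predicted_ranking, k=3)
print(f"Regret@3 = {regret:.3f}")
```

Under this reading, a Regret@3 of 1% would mean that the best of the three top-ranked intermediate tasks comes within 1% (relative) of the best intermediate task overall.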
Anthology ID:
2021.emnlp-main.827
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
10585–10605
URL:
https://aclanthology.org/2021.emnlp-main.827
DOI:
10.18653/v1/2021.emnlp-main.827
Cite (ACL):
Clifton Poth, Jonas Pfeiffer, Andreas Rücklé, and Iryna Gurevych. 2021. What to Pre-Train on? Efficient Intermediate Task Selection. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 10585–10605, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
What to Pre-Train on? Efficient Intermediate Task Selection (Poth et al., EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.827.pdf
Video:
 https://aclanthology.org/2021.emnlp-main.827.mp4
Code
 adapter-hub/efficient-task-transfer
Data
BoolQ, COPA, CoNLL-2003, DROP, GLUE, SuperGLUE