ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization

Hanwei Xu, Yujun Chen, Yulun Du, Nan Shao, Wang Yanggang, Haiyu Li, Zhilin Yang


Abstract
We propose a multitask pretraining approach ZeroPrompt for zero-shot generalization, focusing on task scaling and zero-shot prompting.While previous models are trained on only a few dozen tasks, we scale to 1,000 tasks for the first time using real-world data. This leads to a crucial discovery that task scaling can be an efficient alternative to model scaling; i.e., the model size has less impact on performance with an extremely large number of tasks. Our results show that task scaling can improve training efficiency by 30 times in FLOPs.Empirically, ZeroPrompt substantially improves both the efficiency and the performance of zero-shot learning across a variety of academic and production datasets.
Anthology ID:
2022.findings-emnlp.312
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2022
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
Findings
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
4235–4252
Language:
URL:
https://aclanthology.org/2022.findings-emnlp.312
DOI:
10.18653/v1/2022.findings-emnlp.312
Bibkey:
Cite (ACL):
Hanwei Xu, Yujun Chen, Yulun Du, Nan Shao, Wang Yanggang, Haiyu Li, and Zhilin Yang. 2022. ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 4235–4252, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization (Xu et al., Findings 2022)
Copy Citation:
PDF:
https://aclanthology.org/2022.findings-emnlp.312.pdf