SMASH: Improving SMAll Language Models’ Few-SHot Ability with Prompt-Based Distillation

Yueqian Wang, Chang Liu, Kai Chen, Xi Wang, Dongyan Zhao


Abstract
Large-scale language models coupled with prompts have shown remarkable few-shot learning performance. However, through systematic experiments we find that the few-shot performance of small language models is poor, and that prompting brings smaller improvements on them than on larger models. In this paper, we propose SMASH, an approach to improve SMAll language models' few-SHot ability by training on intermediate tasks before prompt-based fine-tuning on downstream tasks. We design intermediate tasks for sentence-pair and sentiment classification tasks by creating training examples with prompt templates similar to those of the downstream tasks, using sentences sampled from a large-scale unsupervised corpus, and we apply knowledge distillation, using the outputs of larger pre-trained models as the training objective. Extensive experiments show that SMASH enables a 6-layer DistilRoBERTa-base to achieve performance comparable to a 12-layer RoBERTa-base on few-shot datasets at a low cost.
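The following is a minimal sketch of the prompt-based distillation step described in the abstract, assuming a RoBERTa-base teacher and a DistilRoBERTa-base student (the model sizes used in the paper's experiments). The sentiment template ("It was <mask>."), the great/terrible verbalizer, the temperature, and the restriction of the KL loss to the label-word logits at the mask position are illustrative assumptions, not the paper's exact recipe; the construction of intermediate-task examples from an unsupervised corpus is omitted here.

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Teacher: 12-layer RoBERTa-base; student: 6-layer DistilRoBERTa-base.
# Both use the same BPE vocabulary, so one tokenizer suffices.
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
teacher = AutoModelForMaskedLM.from_pretrained("roberta-base").eval()
student = AutoModelForMaskedLM.from_pretrained("distilroberta-base")

# Hypothetical sentiment prompt template and verbalizer (label words).
sentence = "The plot was gripping from start to finish."
prompt = f"{sentence} It was {tokenizer.mask_token}."
verbalizer_ids = tokenizer.convert_tokens_to_ids(["Ġgreat", "Ġterrible"])

inputs = tokenizer(prompt, return_tensors="pt")
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)

# Teacher and student distributions over the label words at the mask position.
with torch.no_grad():
    teacher_logits = teacher(**inputs).logits[mask_pos][:, verbalizer_ids]
student_logits = student(**inputs).logits[mask_pos][:, verbalizer_ids]

# Distillation objective: KL divergence between the student's and the
# teacher's softened distributions over the verbalizer tokens.
temperature = 2.0  # illustrative value, not taken from the paper
loss = F.kl_div(
    F.log_softmax(student_logits / temperature, dim=-1),
    F.softmax(teacher_logits / temperature, dim=-1),
    reduction="batchmean",
) * temperature ** 2
loss.backward()
```

In practice this loss would be computed over batches of prompt-formatted intermediate-task examples before the student is prompt-based fine-tuned on the downstream few-shot data.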
Anthology ID:
2022.findings-emnlp.492
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2022
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6608–6619
URL:
https://aclanthology.org/2022.findings-emnlp.492
DOI:
10.18653/v1/2022.findings-emnlp.492
Cite (ACL):
Yueqian Wang, Chang Liu, Kai Chen, Xi Wang, and Dongyan Zhao. 2022. SMASH: Improving SMAll Language Models’ Few-SHot Ability with Prompt-Based Distillation. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 6608–6619, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
SMASH: Improving SMAll Language Models’ Few-SHot Ability with Prompt-Based Distillation (Wang et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-emnlp.492.pdf