STraTA: Self-Training with Task Augmentation for Better Few-shot Learning

Tu Vu, Minh-Thang Luong, Quoc Le, Grady Simon, Mohit Iyyer


Abstract
Despite their recent successes in tackling many NLP tasks, large-scale pre-trained language models do not perform as well in few-shot settings where only a handful of training examples are available. To address this shortcoming, we propose STraTA, which stands for Self-Training with Task Augmentation, an approach that builds on two key ideas for effectively leveraging unlabeled data. First, STraTA uses task augmentation, a novel technique that synthesizes a large amount of data for auxiliary-task fine-tuning from target-task unlabeled texts. Second, STraTA performs self-training by further fine-tuning the strong base model created by task augmentation on a broad distribution of pseudo-labeled data. Our experiments demonstrate that STraTA can substantially improve sample efficiency across 12 few-shot benchmarks. Remarkably, on the SST-2 sentiment dataset, STraTA, with only 8 training examples per class, achieves comparable results to standard fine-tuning with 67K training examples. Our analyses reveal that task augmentation and self-training are both complementary and independently effective.
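As a rough illustration of the two-stage procedure the abstract describes, here is a minimal sketch in Python. The function and parameter names (strata, generate_auxiliary_data, fine_tune, pseudo_label, num_rounds) are hypothetical placeholders, not the authors' released code; details such as the number of self-training rounds and how pseudo-labels are used follow only the abstract's high-level description.

```python
# Minimal sketch of the STraTA recipe as described in the abstract.
# The callables passed in (generate_auxiliary_data, fine_tune, pseudo_label)
# are hypothetical placeholders standing in for auxiliary-task data synthesis,
# model fine-tuning, and model prediction; they are not the authors' code.

from typing import Callable, List, Tuple


def strata(
    base_model,
    labeled_examples: List[Tuple[str, int]],   # the few-shot target-task examples
    unlabeled_texts: List[str],                # target-task unlabeled texts
    generate_auxiliary_data: Callable,         # unlabeled texts -> auxiliary-task examples
    fine_tune: Callable,                       # (model, examples) -> fine-tuned model
    pseudo_label: Callable,                    # (model, texts) -> pseudo-labeled examples
    num_rounds: int = 5,                       # illustrative; not specified in the abstract
):
    # Stage 1: task augmentation. Synthesize auxiliary-task training data
    # (e.g., NLI-style pairs) from the unlabeled target-task texts and
    # fine-tune the base model on it to obtain a stronger starting point.
    auxiliary_examples = generate_auxiliary_data(unlabeled_texts)
    model = fine_tune(base_model, auxiliary_examples)

    # Stage 2: self-training. Repeatedly fit the current model to the few
    # labeled examples, pseudo-label the unlabeled texts with it, and
    # fine-tune on the combined labeled and pseudo-labeled data.
    for _ in range(num_rounds):
        model = fine_tune(model, labeled_examples)
        pseudo_labeled = pseudo_label(model, unlabeled_texts)
        model = fine_tune(model, labeled_examples + pseudo_labeled)
    return model
```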
Anthology ID:
2021.emnlp-main.462
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
5715–5731
URL:
https://aclanthology.org/2021.emnlp-main.462
DOI:
10.18653/v1/2021.emnlp-main.462
Cite (ACL):
Tu Vu, Minh-Thang Luong, Quoc Le, Grady Simon, and Mohit Iyyer. 2021. STraTA: Self-Training with Task Augmentation for Better Few-shot Learning. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 5715–5731, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
STraTA: Self-Training with Task Augmentation for Better Few-shot Learning (Vu et al., EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.462.pdf
Video:
https://aclanthology.org/2021.emnlp-main.462.mp4
Code:
google-research/google-research
Data:
GLUE, MRPC, MultiNLI, QNLI, SICK, SNLI, SST, SST-2, SST-5