NSP-BERT: A Prompt-based Few-Shot Learner through an Original Pre-training Task —— Next Sentence Prediction

Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu


Abstract
Using prompts to have language models perform various downstream tasks, also known as prompt-based learning or prompt-learning, has lately achieved significant success compared with the pre-train and fine-tune paradigm. Nonetheless, most prompt-based methods are token-level, such as PET, which is based on the masked language model (MLM). In this paper, we attempt to accomplish several NLP tasks in the zero-shot and few-shot scenarios using an original BERT pre-training task abandoned by RoBERTa and other models: Next Sentence Prediction (NSP). Unlike token-level techniques, our sentence-level prompt-based method NSP-BERT does not need to fix the length of the prompt or the position to be predicted, allowing it to handle tasks such as entity linking with ease. NSP-BERT can be applied to a variety of tasks based on its properties. We present an NSP-tuning approach with binary cross-entropy loss for single-sentence classification tasks that is competitive with PET and EFL. Continuing to train BERT on RoBERTa’s corpus improved the model’s performance significantly, which indicates that the pre-training corpus is another important determinant of few-shot performance besides model size and prompt method.
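
The sentence-level idea in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation (see the linked code repository for that). It assumes the Hugging Face `transformers` library and the `bert-base-uncased` checkpoint, neither of which is named in the abstract. The function names, the one-prompt-per-label layout, and the way the binary cross-entropy targets are built (1 for the gold label's prompt, 0 for the others) are hypothetical choices made for illustration only.

```python
import math
from typing import Callable, Dict

# An NSP scorer maps (sentence_a, sentence_b) to P(sentence_b follows sentence_a).
NspScorer = Callable[[str, str], float]


def load_bert_nsp_scorer(model_name: str = "bert-base-uncased") -> NspScorer:
    """Wrap BERT's pre-trained NSP head as a scorer.

    Assumption: torch and transformers are installed; they are imported
    lazily so the rest of this sketch runs without them.
    """
    import torch
    from transformers import BertForNextSentencePrediction, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_name)
    model = BertForNextSentencePrediction.from_pretrained(model_name)
    model.eval()

    def score(sentence_a: str, sentence_b: str) -> float:
        inputs = tokenizer(sentence_a, sentence_b,
                           return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = model(**inputs).logits  # shape (1, 2)
        # In this head, index 0 means "sentence_b continues sentence_a".
        return torch.softmax(logits, dim=-1)[0, 0].item()

    return score


def classify(text: str, prompts: Dict[str, str], scorer: NspScorer) -> str:
    """Zero-shot prediction: pick the label whose prompt sentence is the
    most plausible next sentence for `text`."""
    scores = {label: scorer(text, prompt) for label, prompt in prompts.items()}
    return max(scores, key=scores.get)


def nsp_bce_loss(text: str, prompts: Dict[str, str], gold_label: str,
                 scorer: NspScorer) -> float:
    """Mean binary cross-entropy over all (text, prompt) pairs.

    Assumption: the gold label's prompt is the positive pair (target 1)
    and every other prompt is a negative pair (target 0).
    """
    eps = 1e-12  # keep log() finite when a probability is exactly 0 or 1
    total = 0.0
    for label, prompt in prompts.items():
        p = min(max(scorer(text, prompt), eps), 1.0 - eps)
        target = 1.0 if label == gold_label else 0.0
        total += -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
    return total / len(prompts)
```

With the libraries installed, a call such as `classify("The team won the final.", {"sports": "This is sports news.", "finance": "This is finance news."}, load_bert_nsp_scorer())` would score each prompt as a whole sentence. Because no `[MASK]` position or prompt length has to be fixed, the same pattern accepts prompts of any length. The example prompts are illustrative and are not the templates used in the paper.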
Anthology ID:
2022.coling-1.286
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
3233–3250
URL:
https://aclanthology.org/2022.coling-1.286
Cite (ACL):
Yi Sun, Yu Zheng, Chao Hao, and Hangping Qiu. 2022. NSP-BERT: A Prompt-based Few-Shot Learner through an Original Pre-training Task —— Next Sentence Prediction. In Proceedings of the 29th International Conference on Computational Linguistics, pages 3233–3250, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
NSP-BERT: A Prompt-based Few-Shot Learner through an Original Pre-training Task —— Next Sentence Prediction (Sun et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.286.pdf
Code
 sunyilgdx/prompts4keras
Data
AG News, CLUE, ChID, FewCLUE, GLUE, MPQA Opinion Corpus, MultiNLI, OCNLI, QNLI, SNLI, SST