Pre-trained Token-replaced Detection Model as Few-shot Learner

Zicheng Li, Shoushan Li, Guodong Zhou


Abstract
Pre-trained masked language models have demonstrated remarkable ability as few-shot learners. In this paper, as an alternative, we propose a novel approach to few-shot learning with pre-trained token-replaced detection models like ELECTRA. In this approach, we reformulate a classification or a regression task as a token-replaced detection problem. Specifically, we first define a template and label description words for each task and put them into the input to form a natural language prompt. Then, we employ the pre-trained token-replaced detection model to predict which label description word is the most original (i.e., least replaced) among all label description words in the prompt. A systematic evaluation on 16 datasets demonstrates that our approach outperforms few-shot learners with pre-trained masked language models in both one-sentence and two-sentence learning tasks.
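The prediction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The template ("It was ..."), the label description words, and the names build_prompt, predict_label, and toy_discriminator are all assumptions made for illustration. In particular, toy_discriminator is a hand-written stand-in for a pre-trained token-replaced detection model such as ELECTRA's discriminator, which would return a "replaced" probability for every token in the prompt.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical interface: maps a token sequence to one "replaced"
# probability per token, as a token-replaced detection model would.
Discriminator = Callable[[Sequence[str]], List[float]]


def build_prompt(
    sentence_tokens: Sequence[str], label_words: Dict[str, str]
) -> Tuple[List[str], Dict[str, int]]:
    """Append an assumed template that contains every label description word.

    Returns the prompt tokens and the position of each label's word.
    """
    prompt = list(sentence_tokens) + ["It", "was"]
    positions: Dict[str, int] = {}
    for label, word in label_words.items():
        positions[label] = len(prompt)
        prompt.append(word)
    prompt.append(".")
    return prompt, positions


def predict_label(
    sentence_tokens: Sequence[str],
    label_words: Dict[str, str],
    discriminator: Discriminator,
) -> str:
    """Pick the label whose description word looks most original,
    i.e. has the lowest "replaced" probability in the prompt."""
    prompt, positions = build_prompt(sentence_tokens, label_words)
    replaced_prob = discriminator(prompt)
    return min(positions, key=lambda label: replaced_prob[positions[label]])


def toy_discriminator(tokens: Sequence[str]) -> List[float]:
    """Illustrative stand-in for a real model (assumption, not ELECTRA)."""
    cheerful = "wonderful" in tokens
    scores: List[float] = []
    for tok in tokens:
        if tok == "great":
            scores.append(0.1 if cheerful else 0.9)
        elif tok == "terrible":
            scores.append(0.9 if cheerful else 0.1)
        else:
            scores.append(0.05)
    return scores


label_words = {"positive": "great", "negative": "terrible"}
prediction = predict_label(
    ["A", "wonderful", "film", "."], label_words, toy_discriminator
)
```

In a real setting the stand-in would be replaced by a pre-trained discriminator, and the positions would have to be tracked at the subword level, since label description words may be split into several tokens.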
Anthology ID:
2022.coling-1.289
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
3274–3284
URL:
https://aclanthology.org/2022.coling-1.289
Cite (ACL):
Zicheng Li, Shoushan Li, and Guodong Zhou. 2022. Pre-trained Token-replaced Detection Model as Few-shot Learner. In Proceedings of the 29th International Conference on Computational Linguistics, pages 3274–3284, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Pre-trained Token-replaced Detection Model as Few-shot Learner (Li et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.289.pdf
Data
CoLA, GLUE, QNLI, SNLI