Incremental Few-shot Text Classification with Multi-round New Classes: Formulation, Dataset and System

Congying Xia, Wenpeng Yin, Yihao Feng, Philip Yu


Abstract
Text classification is usually studied by labeling natural language texts with relevant categories from a predefined set. In the real world, new classes might keep challenging the existing system with limited labeled data. The system should be intelligent enough to recognize upcoming new classes with a few examples. In this work, we define a new task in the NLP domain, incremental few-shot text classification, where the system incrementally handles multiple rounds of new classes. For each round, there is a batch of new classes with a few labeled examples per class. Two major challenges exist in this new task: (i) For the learning process, the system should incrementally learn new classes round by round without re-training on the examples of preceding classes; (ii) For the performance, the system should perform well on new classes without much loss on preceding classes. In addition to formulating the new task, we also release two benchmark datasets in the incremental few-shot setting: intent classification and relation classification. Moreover, we propose two entailment approaches, ENTAILMENT and HYBRID, which show promise for solving this novel problem.
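The round-by-round protocol described in the abstract can be illustrated with a minimal sketch. This is not the paper's ENTAILMENT or HYBRID system: the classifier below is a toy bag-of-words prototype model swapped in purely for illustration, and the intent names, utterances, and the `PrototypeClassifier` and `learn_round` names are all hypothetical. It shows only the shape of the task: each round brings a few labeled examples for new classes, earlier examples are not revisited, and evaluation covers every class seen so far.

```python
from collections import Counter
import math


def bow(text):
    """Lowercased bag-of-words vector as a Counter."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class PrototypeClassifier:
    """Toy stand-in (an assumption, not the paper's method):
    one summed bag-of-words prototype per class."""

    def __init__(self):
        self.prototypes = {}

    def learn_round(self, support):
        # support: {class_name: [a few labeled texts]}.
        # Only the new round's examples are read; prototypes of
        # preceding classes are left untouched (no re-training on them).
        for label, texts in support.items():
            proto = Counter()
            for t in texts:
                proto.update(bow(t))
            self.prototypes[label] = proto

    def predict(self, text):
        vec = bow(text)
        return max(self.prototypes,
                   key=lambda c: cosine(vec, self.prototypes[c]))


# Hypothetical data: two rounds of new intent classes, two shots per class.
rounds = [
    {"book_flight": ["book a flight to paris", "i need a plane ticket"],
     "play_music": ["play some jazz music", "put on a song"]},
    {"check_weather": ["what is the weather today", "will it rain tomorrow"]},
]
test_set = [
    ("book a flight to boston", "book_flight"),
    ("play a song", "play_music"),
    ("will it rain today", "check_weather"),
]

clf = PrototypeClassifier()
seen = set()
accuracies = []
for support in rounds:
    clf.learn_round(support)   # incremental update with this round only
    seen.update(support)       # classes introduced so far
    # Evaluate on new and preceding classes together.
    eval_items = [(t, y) for t, y in test_set if y in seen]
    correct = sum(clf.predict(t) == y for t, y in eval_items)
    accuracies.append(correct / len(eval_items))

print(accuracies)
```

Tracking accuracy after every round, over all classes seen so far, is what exposes the second challenge named in the abstract: gains on new classes should not come with much loss on preceding ones.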
Anthology ID:
2021.naacl-main.106
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
June
Year:
2021
Address:
Online
Editors:
Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, Yichao Zhou
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
1351–1360
URL:
https://aclanthology.org/2021.naacl-main.106
DOI:
10.18653/v1/2021.naacl-main.106
Cite (ACL):
Congying Xia, Wenpeng Yin, Yihao Feng, and Philip Yu. 2021. Incremental Few-shot Text Classification with Multi-round New Classes: Formulation, Dataset and System. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1351–1360, Online. Association for Computational Linguistics.
Cite (Informal):
Incremental Few-shot Text Classification with Multi-round New Classes: Formulation, Dataset and System (Xia et al., NAACL 2021)
PDF:
https://aclanthology.org/2021.naacl-main.106.pdf
Video:
https://aclanthology.org/2021.naacl-main.106.mp4