FastClass: A Time-Efficient Approach to Weakly-Supervised Text Classification

Tingyu Xia; Yue Wang; Yuan Tian; Yi Chang

doi:10.18653/v1/2022.emnlp-main.313

Abstract

Weakly-supervised text classification aims to train a classifier using only class descriptions and unlabeled data. Recent research shows that keyword-driven methods can achieve state-of-the-art performance on various tasks. However, these methods not only rely on carefully crafted class descriptions to obtain class-specific keywords but also require a substantial amount of unlabeled data and take a long time to train. This paper proposes FastClass, an efficient weakly-supervised classification approach. It uses dense text representation to retrieve class-relevant documents from an external unlabeled corpus and selects an optimal subset to train a classifier. Compared to keyword-driven methods, our approach is less reliant on initial class descriptions, as it no longer needs to expand each class description into a set of class-specific keywords. Experiments on a wide range of classification tasks show that the proposed approach frequently outperforms keyword-driven models in terms of classification accuracy and often enjoys orders-of-magnitude faster training speed.
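
The retrieval idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`embed`, `cosine`, `retrieve_pseudo_labeled`) are hypothetical, the normalized bag-of-words vector is only a toy stand-in for a learned dense encoder, and the fixed `top_k` cutoff is a placeholder for the paper's subset selection step, which the abstract does not specify. Under those assumptions, each class description is used as a query, and the most similar unlabeled documents become pseudo-labeled training data for that class.

```python
import math
from collections import Counter


def embed(text, vocab):
    """Toy stand-in for a dense encoder (assumption): L2-normalized word counts."""
    counts = Counter(text.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec


def cosine(u, v):
    """Cosine similarity of two vectors that are already L2-normalized."""
    return sum(a * b for a, b in zip(u, v))


def retrieve_pseudo_labeled(class_descriptions, corpus, top_k):
    """Return, per class, up to top_k corpus documents most similar to its description.

    Documents with zero similarity are dropped. The fixed top_k cutoff is a
    placeholder (assumption), not the paper's subset selection procedure.
    """
    texts = list(class_descriptions.values()) + list(corpus)
    vocab = sorted({w for t in texts for w in t.lower().split()})
    doc_vecs = [embed(d, vocab) for d in corpus]

    selected = {}
    for label, description in class_descriptions.items():
        query = embed(description, vocab)
        sims = [cosine(query, v) for v in doc_vecs]
        ranked = sorted(range(len(corpus)), key=lambda i: sims[i], reverse=True)
        selected[label] = [corpus[i] for i in ranked[:top_k] if sims[i] > 0]
    return selected


if __name__ == "__main__":
    classes = {
        "sports": "sports game team",
        "politics": "politics election government",
    }
    unlabeled = [
        "the team won the game",
        "the election results surprised the government",
        "a recipe for pasta",
        "the home team lost the game again",
    ]
    for label, docs in retrieve_pseudo_labeled(classes, unlabeled, top_k=2).items():
        print(label, docs)
```

The retrieved documents, paired with their class labels, could then be passed to any standard supervised text classifier. In practice the toy `embed` function would be replaced by a pretrained sentence encoder.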

Anthology ID: 2022.emnlp-main.313
Original: 2022.emnlp-main.313v1
Version 2: 2022.emnlp-main.313v2
Volume: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 4746–4758
URL: https://aclanthology.org/2022.emnlp-main.313
DOI: 10.18653/v1/2022.emnlp-main.313
Cite (ACL): Tingyu Xia, Yue Wang, Yuan Tian, and Yi Chang. 2022. FastClass: A Time-Efficient Approach to Weakly-Supervised Text Classification. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 4746–4758, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): FastClass: A Time-Efficient Approach to Weakly-Supervised Text Classification (Xia et al., EMNLP 2022)
PDF: https://aclanthology.org/2022.emnlp-main.313.pdf
Software: 2022.emnlp-main.313.software.zip
