Jinggui Liang


2024

pdf bib
Actively Learn from LLMs with Uncertainty Propagation for Generalized Category Discovery
Jinggui Liang | Lizi Liao | Hao Fei | Bobo Li | Jing Jiang
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Generalized category discovery faces a key issue: the lack of supervision for new and unseen data categories. Traditional methods typically combine supervised pretraining with self-supervised learning to create models, and then employ clustering for category identification. However, these approaches tend to become overly tailored to known categories, failing to fully resolve the core issue. Hence, we propose to integrate the feedback from LLMs into an active learning paradigm. Specifically, our method innovatively employs uncertainty propagation to select data samples from high-uncertainty regions, which are then labeled using LLMs through a comparison-based prompting scheme. This not only eases the labeling task but also enhances accuracy in identifying new categories. Additionally, a soft feedback propagation mechanism is introduced to minimize the spread of inaccurate feedback. Experiments on various datasets demonstrate our framework’s efficacy and generalizability, significantly improving baseline models at a nominal average cost.

2023

pdf bib
ClusterPrompt: Cluster Semantic Enhanced Prompt Learning for New Intent Discovery
Jinggui Liang | Lizi Liao
Findings of the Association for Computational Linguistics: EMNLP 2023

The discovery of new intent categories from user utterances is a crucial task in expanding agent skills. The key lies in how to efficiently solicit semantic evidence from utterances and properly transfer knowledge from existing intents to new intents. However, previous methods laid too much emphasis on relations among utterances or clusters for transfer learning, while paying less attention to the usage of semantics. As a result, these methods suffer from in-domain over-fitting and often generate meaningless new intent clusters due to data distortion. In this paper, we present a novel approach called Cluster Semantic Enhanced Prompt Learning (CsePL) for discovering new intents. Our method leverages two-level contrastive learning with label semantic alignment to learn meaningful representations of intent clusters. These learned intent representations are then utilized as soft prompt initializations for discriminating new intents, reducing the dominance of existing intents. Extensive experiments conducted on three public datasets demonstrate the superiority of our proposed method. It not only outperforms existing methods but also suggests meaningful intent labels and enables early detection of new intents.