Learning Geometry-Aware Representations for New Intent Discovery

Kai Tang, Junbo Zhao, Xiao Ding, Runze Wu, Lei Feng, Gang Chen, Haobo Wang


Abstract
New intent discovery (NID) is an important problem for deploying practical dialogue systems, which trains intent classifiers on a semi-supervised corpus where unlabeled user utterances contain both known and novel intents. Most existing NID algorithms place hope on the sample similarity to cluster unlabeled corpus to known or new samples. Lacking supervision on new intents, we experimentally find the intent classifier fails to fully distinguish new intents since they tend to assemble into intertwined centers.To address this problem, we propose a novel GeoID framework that learns geometry-aware representations to maximally separate all intents. Specifically, we are motivated by the recent findings on Neural Collapse (NC) in classification tasks to derive optimal intent center structure. Meanwhile, we devise a dual pseudo-labeling strategy based on optimal transport assignments and semi-supervised clustering, ensuring proper utterances-to-center arrangement.Extensive results show that our GeoID method establishes a new state-of-the-art performance, achieving a +3.49% average accuracy improvement on three standardized benchmarking datasets. We also verify its usefulness in assisting large language models for improved in-context performance.
Anthology ID:
2024.luhme-long.306
Volume:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
5641–5654
Language:
URL:
https://aclanthology.org/2024.luhme-long.306/
DOI:
10.18653/v1/2024.acl-long.306
Bibkey:
Cite (ACL):
Kai Tang, Junbo Zhao, Xiao Ding, Runze Wu, Lei Feng, Gang Chen, and Haobo Wang. 2024. Learning Geometry-Aware Representations for New Intent Discovery. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5641–5654, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Learning Geometry-Aware Representations for New Intent Discovery (Tang et al., ACL 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.acl-long.306.pdf