Synergizing Large Language Models and Pre-Trained Smaller Models for Conversational Intent Discovery

Jinggui Liang, Lizi Liao, Hao Fei, Jing Jiang


Abstract
In Conversational Intent Discovery (CID), Small Language Models (SLMs) struggle with overfitting to familiar intents and fail to label newly discovered ones. This issue stems from their limited grasp of semantic nuances and their intrinsically discriminative framework. Therefore, we propose Synergizing Large Language Models (LLMs) with pre-trained SLMs for CID (SynCID). It harnesses the profound semantic comprehension of LLMs alongside the operational agility of SLMs. By utilizing LLMs to refine both utterances and existing intent labels, SynCID significantly enhances the semantic depth, subsequently realigning these enriched descriptors within the SLMs’ feature space to correct cluster distortion and promote robust learning of representations. A key advantage is its capacity for the early identification of new intents, a critical aspect for deploying conversational agents successfully. Additionally, SynCID leverages the in-context learning strengths of LLMs to generate labels for new intents. Thorough evaluations across a wide array of datasets have demonstrated its superior performance over traditional CID methods.
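The abstract outlines a pipeline in which an LLM enriches utterances, a pre-trained SLM embeds and clusters them, and the LLM then names the discovered intents via in-context learning. Below is a minimal sketch of that flow, not the authors' code: it assumes an `llm(prompt) -> str` callable wrapping any chat model, the `sentence-transformers` library as the SLM encoder, and scikit-learn for clustering; the prompts, model name, and hyperparameters are illustrative, and the paper's feature-space realignment (representation learning) step is omitted.

```python
# Hedged sketch of a SynCID-style pipeline (illustrative, not the paper's implementation).
from typing import Callable, Dict, List, Tuple
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans


def discover_intents(
    utterances: List[str],
    llm: Callable[[str], str],          # assumed chat-model wrapper, e.g. OpenAI/HF client
    n_intents: int,
    encoder_name: str = "all-MiniLM-L6-v2",  # example SLM; the paper may use a different encoder
) -> Tuple[List[int], Dict[int, str]]:
    # 1) LLM refines each utterance into a concise intent descriptor,
    #    supplying semantic depth the smaller encoder may miss.
    descriptors = [
        llm(f"Describe the speaker's intent in one short phrase:\n{u}")
        for u in utterances
    ]

    # 2) A pre-trained SLM embeds the enriched descriptors; clustering the
    #    embeddings groups utterances by (possibly unseen) intent.
    encoder = SentenceTransformer(encoder_name)
    embeddings = encoder.encode(descriptors, normalize_embeddings=True)
    cluster_ids = KMeans(n_clusters=n_intents, n_init=10).fit_predict(embeddings)

    # 3) LLM in-context labeling: show a few utterances per cluster and ask
    #    for a short label for the newly discovered intent.
    labels: Dict[int, str] = {}
    for c in range(n_intents):
        examples = [u for u, k in zip(utterances, cluster_ids) if k == c][:5]
        labels[c] = llm(
            "Propose a short intent label for these utterances:\n" + "\n".join(examples)
        )
    return list(cluster_ids), labels
```

In practice the enriched descriptors would also be used to fine-tune the SLM encoder (the realignment the abstract mentions) before clustering; the sketch above only illustrates the division of labor between the LLM and the SLM.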
Anthology ID:
2024.findings-acl.840
Volume:
Findings of the Association for Computational Linguistics: ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
14133–14147
URL:
https://aclanthology.org/2024.findings-acl.840
DOI:
10.18653/v1/2024.findings-acl.840
Cite (ACL):
Jinggui Liang, Lizi Liao, Hao Fei, and Jing Jiang. 2024. Synergizing Large Language Models and Pre-Trained Smaller Models for Conversational Intent Discovery. In Findings of the Association for Computational Linguistics: ACL 2024, pages 14133–14147, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Synergizing Large Language Models and Pre-Trained Smaller Models for Conversational Intent Discovery (Liang et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.840.pdf