CATCH: A Controllable Theme Detection Framework with Contextualized Clustering and Hierarchical Generation

Rui Ke, Jiahui Xu, Kuang Wang, Shenghao Yang, Feng Jiang, Haizhou Li


Abstract
Theme detection is a fundamental task in user-centric dialogue systems, aiming to identify the latent topic of each utterance without relying on predefined schemas. Unlike intent induction, which operates within fixed label spaces, theme detection requires cross-dialogue consistency and alignment with personalized user preferences, posing significant challenges. Existing methods often struggle with sparse, short utterances and fail to capture user-level thematic preferences across dialogues. To address these challenges, we propose CATCH (Controllable Theme Detection with Contextualized Clustering and Hierarchical Generation), a unified framework that integrates three core components: (1) context-aware topic representation, which enriches utterance-level semantics using surrounding topic segments; (2) preference-guided topic clustering, which jointly models semantic proximity and personalized feedback to align themes across conversations; and (3) a hierarchical theme generation mechanism designed to suppress noise and produce robust, coherent topic labels. Experiments on a multi-domain customer dialogue benchmark demonstrate that CATCH achieves state-of-the-art performance in both theme classification and topic distribution quality. Notably, it ranked second in the official blind evaluation of the DSTC-12 Controllable Theme Detection Track, showcasing its effectiveness and generalizability in real-world dialogue systems.
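To make the second component more concrete, the following is a minimal sketch of how semantic proximity and user preference feedback could be combined in one clustering step. It is not the authors' implementation: the abstract does not specify the representation, the form of the feedback, or the clustering algorithm. The sketch assumes utterances are already embedded as vectors, that preferences arrive as hypothetical "should-link" and "cannot-link" index pairs, and that average-linkage agglomerative clustering with a distance threshold is an acceptable stand-in. All function names and parameters are illustrative.

```python
from math import sqrt


def cosine_distance(u, v):
    """Return 1 - cosine similarity for two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def preference_adjusted_distances(vectors, should_link=(), cannot_link=()):
    """Build a pairwise distance matrix, then override it with preferences.

    Assumption: preferences are given as pairs of utterance indices.
    should_link pairs are pulled to distance 0.0, cannot_link pairs are
    pushed to 2.0, the largest possible cosine distance.
    """
    n = len(vectors)
    dist = [[cosine_distance(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]
    for i, j in should_link:
        dist[i][j] = dist[j][i] = 0.0
    for i, j in cannot_link:
        dist[i][j] = dist[j][i] = 2.0
    return dist


def average_linkage_clusters(dist, threshold):
    """Greedy average-linkage agglomerative clustering.

    Repeatedly merges the two closest clusters and stops once the
    closest pair is farther apart than `threshold`.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None  # (distance, index_a, index_b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j]
                            for i in clusters[a] for j in clusters[b])
                avg = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        if best[0] > threshold:
            break
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy embeddings: items 0 and 1 are close, items 2 and 3 are close.
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

plain = average_linkage_clusters(
    preference_adjusted_distances(toy_vectors), threshold=0.5)

# A hypothetical user signals that utterances 0 and 1 belong to different themes.
guided = average_linkage_clusters(
    preference_adjusted_distances(toy_vectors, cannot_link=[(0, 1)]),
    threshold=0.5)
```

In this toy setup the cannot-link pair keeps two semantically close utterances in separate clusters, which illustrates the general idea of letting feedback override pure embedding similarity. Hard overrides are only one option; a softer penalty added to the distance would be an equally plausible reading of "jointly models".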
Anthology ID:
2025.dstc-1.2
Volume:
Proceedings of the Twelfth Dialog System Technology Challenge
Month:
August
Year:
2025
Address:
Avignon, France
Editors:
Behnam Hedayatnia, Vivian Chen, Zhang Chen, Raghav Gupta, Michel Galley
Venues:
DSTC | WS
Publisher:
Association for Computational Linguistics
Pages:
17–26
URL:
https://aclanthology.org/2025.dstc-1.2/
Cite (ACL):
Rui Ke, Jiahui Xu, Kuang Wang, Shenghao Yang, Feng Jiang, and Haizhou Li. 2025. CATCH: A Controllable Theme Detection Framework with Contextualized Clustering and Hierarchical Generation. In Proceedings of the Twelfth Dialog System Technology Challenge, pages 17–26, Avignon, France. Association for Computational Linguistics.
Cite (Informal):
CATCH: A Controllable Theme Detection Framework with Contextualized Clustering and Hierarchical Generation (Ke et al., DSTC 2025)
PDF:
https://aclanthology.org/2025.dstc-1.2.pdf