Coordinated Topic Modeling

Pritom Saha Akash, Jie Huang, Kevin Chen-Chuan Chang


Abstract
We propose a new problem called coordinated topic modeling that imitates how humans describe a text corpus. It treats a set of well-defined topics, each with a reference representation, as the axes of a semantic space, and then models a corpus along those axes to produce an easily understandable representation. This new task makes corpus representations more interpretable by reusing existing knowledge, and it also benefits corpus comparison. We design ECTM, an embedding-based coordinated topic model that uses the reference representations to capture target-corpus-specific aspects while preserving each topic's global semantics. In ECTM, we introduce topic- and document-level supervision with a self-training mechanism to solve the problem. Finally, extensive experiments on multiple domains show the superiority of our model over other baselines.
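The abstract's core idea of coordinating a corpus against fixed topic "axes" can be illustrated with a minimal sketch. The code below is not the authors' ECTM implementation; it only assumes that documents and reference topics are given as embedding vectors, assigns documents to topic axes by cosine similarity, and refines the topic embeddings with a DEC-style self-training loop (sharpened soft assignments as pseudo-targets) while an anchor term pulls each topic back toward its reference representation to keep its global semantics. All names (`coordinate`, `anchor`, `temp`) are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def l2_normalize(m):
    """Row-normalize so dot products become cosine similarities."""
    return m / np.linalg.norm(m, axis=1, keepdims=True)

def coordinate(doc_emb, ref_topic_emb, steps=20, lr=0.1, temp=0.1, anchor=0.5):
    """Sketch of coordinated topic modeling over embeddings.

    doc_emb:       (n_docs, dim) document embeddings of the target corpus
    ref_topic_emb: (n_topics, dim) reference topic embeddings (the "axes")
    Returns corpus-adapted topic embeddings and a doc-topic distribution.
    """
    topics = ref_topic_emb.copy()
    for _ in range(steps):
        # Soft assignment of each document to each topic axis.
        q = softmax(l2_normalize(doc_emb) @ l2_normalize(topics).T / temp, axis=1)
        # Self-training target: sharpen assignments (square, renormalize).
        p = q ** 2 / q.sum(axis=0)
        p = p / p.sum(axis=1, keepdims=True)
        # Corpus-specific topic estimate: docs weighted by sharpened targets.
        corpus_topics = (p.T @ doc_emb) / p.sum(axis=0)[:, None]
        # Move toward the corpus estimate, anchored to the reference
        # so each topic keeps its global semantics.
        topics = topics + lr * ((1 - anchor) * (corpus_topics - topics)
                                + anchor * (ref_topic_emb - topics))
    q = softmax(l2_normalize(doc_emb) @ l2_normalize(topics).T / temp, axis=1)
    return topics, q

# Synthetic demo: 3 reference axes, 5 documents clustered around each.
rng = np.random.default_rng(0)
axes = rng.normal(size=(3, 8))
docs = np.vstack([axes[i] + 0.1 * rng.normal(size=(5, 8)) for i in range(3)])
topics, q = coordinate(docs, axes)
```

Here `q` is a (15, 3) doc-topic distribution whose rows sum to 1; the anchor term is the sketch's stand-in for the paper's topic-level supervision, while the sharpened targets play the role of document-level self-training.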
Anthology ID: 2022.emnlp-main.668
Volume: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 9831–9843
URL: https://aclanthology.org/2022.emnlp-main.668
DOI: 10.18653/v1/2022.emnlp-main.668
Cite (ACL): Pritom Saha Akash, Jie Huang, and Kevin Chen-Chuan Chang. 2022. Coordinated Topic Modeling. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 9831–9843, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): Coordinated Topic Modeling (Akash et al., EMNLP 2022)
PDF: https://aclanthology.org/2022.emnlp-main.668.pdf