vONTSS: vMF based semi-supervised neural topic modeling with optimal transport

Weijie Xu, Xiaoyu Jiang, Srinivasan Sengamedu Hanumantha Rao, Francis Iannacci, Jinjin Zhao


Abstract
Recently, Neural Topic Models (NTM), inspired by variational autoencoders, have attracted a lot of research interest; however, these methods have limited applications in the real world due to the challenge of incorporating human knowledge. This work presents a semi-supervised neural topic modeling method, vONTSS, which uses von Mises-Fisher (vMF) based variational autoencoders and optimal transport. When a few keywords per topic are provided, vONTSS in the semi-supervised setting generates potential topics and optimizes topic-keyword quality and topic classification. Experiments show that vONTSS outperforms existing semi-supervised topic modeling methods in classification accuracy and diversity. vONTSS also supports unsupervised topic modeling. Quantitative and qualitative experiments show that vONTSS in the unsupervised setting outperforms recent NTMs on multiple aspects: vONTSS discovers highly clustered and coherent topics on benchmark datasets. It is also much faster than the state-of-the-art weakly supervised text classification method while achieving similar classification performance. We further prove the equivalence of optimal transport loss and cross-entropy loss at the global minimum.
Anthology ID:
2023.findings-acl.271
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
4433–4457
Language:
URL:
https://aclanthology.org/2023.findings-acl.271
DOI:
10.18653/v1/2023.findings-acl.271
Bibkey:
Cite (ACL):
Weijie Xu, Xiaoyu Jiang, Srinivasan Sengamedu Hanumantha Rao, Francis Iannacci, and Jinjin Zhao. 2023. vONTSS: vMF based semi-supervised neural topic modeling with optimal transport. In Findings of the Association for Computational Linguistics: ACL 2023, pages 4433–4457, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
vONTSS: vMF based semi-supervised neural topic modeling with optimal transport (Xu et al., Findings 2023)
Copy Citation:
PDF:
https://aclanthology.org/2023.findings-acl.271.pdf