HyHTM: Hyperbolic Geometry-based Hierarchical Topic Model

Simra Shahid, Tanay Anand, Nikitha Srikanth, Sumit Bhatia, Balaji Krishnamurthy, Nikaash Puri


Abstract
Hierarchical Topic Models (HTMs) are useful for discovering topic hierarchies in a collection of documents. However, traditional HTMs often produce hierarchies where lower-level topics are unrelated and not specific enough to their higher-level topics. Additionally, these methods can be computationally expensive. We present HyHTM - a Hyperbolic geometry-based Hierarchical Topic Model - that addresses these limitations by incorporating hierarchical information from hyperbolic geometry to explicitly model hierarchies in topic models. Experimental results with four baselines show that HyHTM can better attend to parent-child relationships among topics. HyHTM produces coherent topic hierarchies that specialize in granularity from generic higher-level topics to specific lower-level topics. Further, our model is significantly faster and leaves a much smaller memory footprint than our best-performing baseline. We have made the source code for our algorithm publicly accessible.
Anthology ID:
2023.findings-acl.742
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
11672–11688
URL:
https://aclanthology.org/2023.findings-acl.742
DOI:
10.18653/v1/2023.findings-acl.742
Cite (ACL):
Simra Shahid, Tanay Anand, Nikitha Srikanth, Sumit Bhatia, Balaji Krishnamurthy, and Nikaash Puri. 2023. HyHTM: Hyperbolic Geometry-based Hierarchical Topic Model. In Findings of the Association for Computational Linguistics: ACL 2023, pages 11672–11688, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
HyHTM: Hyperbolic Geometry-based Hierarchical Topic Model (Shahid et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-acl.742.pdf