Sungjun Cho
2019
Practical Correlated Topic Modeling and Analysis via the Rectified Anchor Word Algorithm
Moontae Lee | Sungjun Cho | David Bindel | David Mimno
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Despite their great scalability on large data and their ability to capture correlations between topics, spectral topic models have not been widely used because of their unreliability on real data and the lack of practical implementations. This paper aims to solidify the foundations of spectral topic inference and provide a practical implementation for anchor-based topic modeling. Beginning with vocabulary curation, we scrutinize every inference step against other viable options. We also evaluate our matrix-based approach against popular alternatives, including a tensor-based spectral method as well as probabilistic algorithms. Our quantitative and qualitative experiments demonstrate the power of the Rectified Anchor Word algorithm on various real datasets, providing a complete guide to practical correlated topic modeling.