@inproceedings{adhya-sanyal-2025-s2wtm,
title = "{S}2{WTM}: Spherical Sliced-{W}asserstein Autoencoder for Topic Modeling",
author = "Adhya, Suman and
Sanyal, Debarshi Kumar",
editor = "Che, Wanxiang and
Nabende, Joyce and
Shutova, Ekaterina and
Pilehvar, Mohammad Taher",
booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.acl-long.1131/",
doi = "10.18653/v1/2025.acl-long.1131",
pages = "23211--23225",
ISBN = "979-8-89176-251-0",
abstract = "Modeling latent representations in a hyperspherical space has proven effective for capturing directional similarities in high-dimensional text data, benefiting topic modeling. Variational autoencoder-based neural topic models (VAE-NTMs) commonly adopt the von Mises-Fisher prior to encode hyperspherical structure. However, VAE-NTMs often suffer from posterior collapse, where the KL divergence term in the objective function highly diminishes, leading to ineffective latent representations. To mitigate this issue while modeling hyperspherical structure in the latent space, we propose the Spherical Sliced Wasserstein Autoencoder for Topic Modeling (S2WTM). S2WTM employs a prior distribution supported on the unit hypersphere and leverages the Spherical Sliced-Wasserstein distance to align the aggregated posterior distribution with the prior. Experimental results demonstrate that S2WTM outperforms state-of-the-art topic models, generating more coherent and diverse topics while improving performance on downstream tasks."
}
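The abstract sketches the core computation: encoder outputs on the unit hypersphere are pushed toward a hyperspherical prior with a Spherical Sliced-Wasserstein distance in place of a KL term. Below is a minimal, self-contained NumPy sketch of that slicing idea, not the authors' implementation: every function name and parameter is hypothetical, random great-circle projections stand in for the paper's slicing scheme, and the 1D comparison uses the sorted-sample formula for Wasserstein-2 on the line rather than the exact circular Wasserstein the method calls for.

# Hypothetical illustration only (not code from the paper): compare two
# point clouds on the unit hypersphere S^{d-1} by averaging a 1D
# Wasserstein-2 discrepancy over random great-circle projections.
import numpy as np

def project_to_great_circle(x, basis):
    # Project points x (n, d) onto the 2-plane spanned by the orthonormal
    # columns of basis (d, 2), renormalize onto the circle, return angles.
    p = x @ basis
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    return np.mod(np.arctan2(p[:, 1], p[:, 0]), 2.0 * np.pi)

def sliced_sphere_discrepancy(x, y, n_slices=50, seed=0):
    # Average over random slices; x and y must hold equally many points.
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    total = 0.0
    for _ in range(n_slices):
        basis, _ = np.linalg.qr(rng.standard_normal((d, 2)))  # random 2-plane
        theta_x = np.sort(project_to_great_circle(x, basis))
        theta_y = np.sort(project_to_great_circle(y, basis))
        # Sorted-sample Wasserstein-2 on the line; the paper's circular
        # variant additionally optimizes over rotations of the circle.
        total += np.mean((theta_x - theta_y) ** 2)
    return total / n_slices

# Toy check: two independent uniform samples on S^9 should score close.
rng = np.random.default_rng(1)
x = rng.standard_normal((256, 10)); x /= np.linalg.norm(x, axis=1, keepdims=True)
y = rng.standard_normal((256, 10)); y /= np.linalg.norm(y, axis=1, keepdims=True)
print(sliced_sphere_discrepancy(x, y))

In the training setup the abstract describes, a discrepancy of this kind would be computed between a batch of encoder outputs (the aggregated posterior) and samples drawn from the hyperspherical prior, replacing the KL term whose collapse motivates the method.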