SAE-NTM: Sentence-Aware Encoder for Neural Topic Modeling

Hao Liu, Jingsheng Gao, Suncheng Xiang, Ting Liu, Yuzhuo Fu


Abstract
Incorporating external knowledge, such as pre-trained language models (PLMs), into neural topic modeling has achieved great success in recent years. However, employing PLMs for topic modeling generally ignores their maximum sequence length and the interaction between external knowledge and the bag-of-words (BOW) representation. To this end, we propose a sentence-aware encoder for neural topic modeling, which adopts fine-grained sentence embeddings as external knowledge to fully exploit the semantic information of input documents. We introduce sentence-aware attention for document representation, where the BOW enables the model to attend to topical sentences that convey topic-related cues. Experiments on three benchmark datasets show that our framework outperforms other state-of-the-art neural topic models in topic coherence. Further, we demonstrate that the proposed approach yields better latent document-topic features, as reflected by improved document classification performance.
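
The abstract does not specify the architecture in detail; as a rough illustration of the mechanism it describes, below is a minimal PyTorch sketch, assuming scaled dot-product attention in which a query projected from the BOW vector attends over per-sentence embeddings (e.g., from a sentence encoder). All module and parameter names (SentenceAwareAttention, bow_proj, sent_proj, hidden_dim) are hypothetical and not taken from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceAwareAttention(nn.Module):
    # Hypothetical sketch: a BOW-derived query attends over sentence
    # embeddings; the weighted sum serves as the document representation
    # fed into the topic model's encoder.
    def __init__(self, vocab_size: int, sent_dim: int, hidden_dim: int):
        super().__init__()
        self.bow_proj = nn.Linear(vocab_size, hidden_dim)   # query from BOW
        self.sent_proj = nn.Linear(sent_dim, hidden_dim)    # keys from sentences

    def forward(self, bow, sent_emb, mask):
        # bow:      (batch, vocab_size) term-frequency vectors
        # sent_emb: (batch, num_sents, sent_dim) sentence embeddings
        # mask:     (batch, num_sents), 1 for real sentences, 0 for padding
        query = self.bow_proj(bow).unsqueeze(1)              # (batch, 1, hidden)
        keys = self.sent_proj(sent_emb)                      # (batch, num_sents, hidden)
        scores = (query * keys).sum(-1) / keys.size(-1) ** 0.5  # scaled dot product
        scores = scores.masked_fill(mask == 0, float("-inf"))
        weights = F.softmax(scores, dim=-1)                  # per-sentence topical weights
        doc_repr = torch.bmm(weights.unsqueeze(1), keys).squeeze(1)  # (batch, hidden)
        return doc_repr, weights

Under these assumptions, the attention weights can be read as per-sentence topical salience, and the pooled doc_repr would replace (or augment) the raw BOW input to the variational encoder of the topic model.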
Anthology ID:
2023.codi-1.14
Volume:
Proceedings of the 4th Workshop on Computational Approaches to Discourse (CODI 2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Michael Strube, Chloé Braud, Christian Hardmeier, Junyi Jessy Li, Sharid Loáiciga, Amir Zeldes
Venue:
CODI
Publisher:
Association for Computational Linguistics
Pages:
106–111
URL:
https://aclanthology.org/2023.codi-1.14
DOI:
10.18653/v1/2023.codi-1.14
Cite (ACL):
Hao Liu, Jingsheng Gao, Suncheng Xiang, Ting Liu, and Yuzhuo Fu. 2023. SAE-NTM: Sentence-Aware Encoder for Neural Topic Modeling. In Proceedings of the 4th Workshop on Computational Approaches to Discourse (CODI 2023), pages 106–111, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
SAE-NTM: Sentence-Aware Encoder for Neural Topic Modeling (Liu et al., CODI 2023)
PDF:
https://aclanthology.org/2023.codi-1.14.pdf