A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings

Lixing Zhu, Yulan He, Deyu Zhou


Abstract
We propose a novel generative model to explore both local and global context for joint learning topics and topic-specific word embeddings. In particular, we assume that global latent topics are shared across documents, a word is generated by a hidden semantic vector encoding its contextual semantic meaning, and its context words are generated conditional on both the hidden semantic vector and global latent topics. Topics are trained jointly with the word embeddings. The trained model maps words to topic-dependent embeddings, which naturally addresses the issue of word polysemy. Experimental results show that the proposed model outperforms the word-level embedding methods in both word similarity evaluation and word sense disambiguation. Furthermore, the model also extracts more coherent topics compared with existing neural topic models or other models for joint learning of topics and word embeddings. Finally, the model can be easily integrated with existing deep contextualized word embedding learning methods to further improve the performance of downstream tasks such as sentiment classification.
Anthology ID:
2020.tacl-1.31
Volume:
Transactions of the Association for Computational Linguistics, Volume 8
Year:
2020
Address:
Cambridge, MA
Editors:
Mark Johnson, Brian Roark, Ani Nenkova
Venue:
TACL
Publisher:
MIT Press
Pages:
471–485
URL:
https://aclanthology.org/2020.tacl-1.31
DOI:
10.1162/tacl_a_00326
Cite (ACL):
Lixing Zhu, Yulan He, and Deyu Zhou. 2020. A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings. Transactions of the Association for Computational Linguistics, 8:471–485.
Cite (Informal):
A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings (Zhu et al., TACL 2020)
PDF:
https://aclanthology.org/2020.tacl-1.31.pdf