Word2Sense: Sparse Interpretable Word Embeddings

Abhishek Panigrahi, Harsha Vardhan Simhadri, Chiranjib Bhattacharyya


Abstract
We present an unsupervised method to generate Word2Sense word embeddings that are interpretable: each dimension of the embedding space corresponds to a fine-grained sense, and the non-negative value of the embedding along the j-th dimension represents the relevance of the j-th sense to the word. The underlying LDA-based generative model can be extended to refine the representation of a polysemous word in a short context, allowing us to use the embeddings in contextual tasks. On computational NLP tasks, Word2Sense embeddings compare well with other word embeddings generated by unsupervised methods. Across tasks such as word similarity, entailment, sense induction, and contextual interpretation, Word2Sense is competitive with the state-of-the-art method for that task. Word2Sense embeddings are at least as sparse and fast to compute as prior art.
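As the abstract describes, each embedding is a sparse, non-negative vector whose j-th entry scores the relevance of sense j to the word. A minimal sketch of how such sparse sense vectors could be stored and compared (all sense ids, weights, and function names here are invented for illustration, not the paper's actual data or API):

```python
# Hypothetical sketch: a Word2Sense-style embedding as a sparse map
# from sense index to non-negative relevance weight.
# All sense ids and weights below are made up for illustration.

def normalize(senses):
    """Turn a sparse {sense_id: weight} map into a probability distribution."""
    total = sum(senses.values())
    return {s: w / total for s, w in senses.items()}

def sense_overlap(a, b):
    """Similarity as shared probability mass over senses (histogram intersection)."""
    a, b = normalize(a), normalize(b)
    return sum(min(a.get(s, 0.0), b.get(s, 0.0)) for s in set(a) | set(b))

# Two toy words: "bank" spreads mass over three senses, "river" over two,
# and they share hypothetical sense 17.
bank_word = {3: 0.5, 17: 0.3, 42: 0.2}
river_word = {17: 0.6, 99: 0.4}

print(round(sense_overlap(bank_word, river_word), 3))  # → 0.3
```

Because the vectors are sparse, only the senses a word actually touches need storing, and the shared support between two words is itself interpretable: it names which senses the words have in common.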
Anthology ID:
P19-1570
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Anna Korhonen, David Traum, Lluís Màrquez
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
5692–5705
URL:
https://aclanthology.org/P19-1570
DOI:
10.18653/v1/P19-1570
Bibkey:
Cite (ACL):
Abhishek Panigrahi, Harsha Vardhan Simhadri, and Chiranjib Bhattacharyya. 2019. Word2Sense: Sparse Interpretable Word Embeddings. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5692–5705, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Word2Sense: Sparse Interpretable Word Embeddings (Panigrahi et al., ACL 2019)
PDF:
https://aclanthology.org/P19-1570.pdf
Video:
https://aclanthology.org/P19-1570.mp4