On Modeling Sense Relatedness in Multi-prototype Word Embedding

Yixin Cao; Jiaxin Shi; Juanzi Li; Zhiyuan Liu; Chengjiang Li

Abstract

To enhance the expressive power of distributional word representation models, many researchers induce word senses through clustering and learn multiple embedding vectors for each word, an approach known as multi-prototype word embedding. However, most related work ignores the relatedness among word senses, which plays an important role. In this paper, we propose a novel approach to capturing word sense relatedness in multi-prototype word embedding models. In particular, we differentiate the original sense and the extended senses of a word by introducing their global occurrence information, and we model their relatedness through local textual context information. Based on the idea of fuzzy clustering, we introduce a random process to integrate these two types of senses and design two non-parametric methods for word sense induction. To make our model more scalable and efficient, we use an online joint learning framework extended from the Skip-gram model. Experimental results demonstrate that our model outperforms both conventional single-prototype embedding models and other multi-prototype embedding models, and that it achieves more stable performance when trained on smaller data.

Anthology ID: I17-1024
Volume: Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month: November
Year: 2017
Address: Taipei, Taiwan
Editors: Greg Kondrak, Taro Watanabe
Venue: IJCNLP
Publisher: Asian Federation of Natural Language Processing
Pages: 233–242
URL: https://aclanthology.org/I17-1024/
Cite (ACL): Yixin Cao, Jiaxin Shi, Juanzi Li, Zhiyuan Liu, and Chengjiang Li. 2017. On Modeling Sense Relatedness in Multi-prototype Word Embedding. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 233–242, Taipei, Taiwan. Asian Federation of Natural Language Processing.
Cite (Informal): On Modeling Sense Relatedness in Multi-prototype Word Embedding (Cao et al., IJCNLP 2017)
PDF: https://aclanthology.org/I17-1024.pdf
