Hyperspherical Query Likelihood Models with Word Embeddings

Ryo Masumura, Taichi Asami, Hirokazu Masataki, Kugatsu Sadamitsu, Kyosuke Nishida, Ryuichiro Higashinaka


Abstract
This paper presents an initial study on hyperspherical query likelihood models (QLMs) for information retrieval (IR). Our motivation is to naturally utilize pre-trained word embeddings for probabilistic IR. To this end, the key idea is to directly treat the word embeddings as random variables in directional probabilistic models based on von Mises-Fisher distributions, which are closely tied to cosine distance. The proposed method lets us take semantic similarities between documents and target queries into consideration in a theoretically grounded way, without introducing heuristic expansion techniques. In addition, this paper reveals relationships between hyperspherical QLMs and conventional QLMs. In document retrieval experiments, a hyperspherical QLM is compared with conventional QLMs and with document distance metrics that use word or document embeddings.
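The idea of scoring a query with von Mises-Fisher (vMF) densities over word embeddings can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes a document is modeled as a uniform mixture of vMF components, one centered on each document word's unit-normalized embedding, with a single shared concentration `kappa`. Under that assumption the vMF normalizer is identical for every document and is dropped, so the scores are only meaningful for ranking. The names `log_score`, `EMB`, and `docs`, the toy 2-dimensional embeddings, and the value `kappa=10.0` are all hypothetical.

```python
import math

def normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_score(query, doc, emb, kappa=10.0):
    """Unnormalized log-likelihood of a query under a uniform vMF mixture
    centered on the document's word embeddings (illustrative assumption).
    The shared vMF normalizer is omitted because it does not affect ranking."""
    total = 0.0
    for q in query:
        qv = normalize(emb[q])
        # The vMF exponent is kappa * cosine(query word, document word).
        exps = [kappa * sum(a * b for a, b in zip(qv, normalize(emb[w])))
                for w in doc]
        m = max(exps)  # log-sum-exp trick for numerical stability
        total += m + math.log(sum(math.exp(e - m) for e in exps) / len(exps))
    return total

# Toy embeddings, invented purely for illustration.
EMB = {
    "car": [1.0, 0.1],
    "automobile": [0.9, 0.2],
    "banana": [0.0, 1.0],
    "fruit": [0.1, 0.9],
}
docs = {"d1": ["automobile"], "d2": ["banana", "fruit"]}

# Rank documents for the query "car"; neither document contains that word.
ranking = sorted(docs, key=lambda d: log_score(["car"], docs[d], EMB),
                 reverse=True)
print(ranking)
```

In this toy setup the document containing "automobile" ranks above the one about fruit even though the query term never appears in either, which is the kind of semantic matching the abstract describes achieving without query or document expansion.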
Anthology ID:
I17-2036
Volume:
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Month:
November
Year:
2017
Address:
Taipei, Taiwan
Editors:
Greg Kondrak, Taro Watanabe
Venue:
IJCNLP
Publisher:
Asian Federation of Natural Language Processing
Pages:
210–216
URL:
https://aclanthology.org/I17-2036
Cite (ACL):
Ryo Masumura, Taichi Asami, Hirokazu Masataki, Kugatsu Sadamitsu, Kyosuke Nishida, and Ryuichiro Higashinaka. 2017. Hyperspherical Query Likelihood Models with Word Embeddings. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 210–216, Taipei, Taiwan. Asian Federation of Natural Language Processing.
Cite (Informal):
Hyperspherical Query Likelihood Models with Word Embeddings (Masumura et al., IJCNLP 2017)
PDF:
https://aclanthology.org/I17-2036.pdf