Neural Embedding Language Models in Semantic Clustering of Web Search Results

Andrey Kutuzov, Elizaveta Kuzmenko


Abstract
In this paper, we present a new approach to semantic clustering of the results of ambiguous search queries. We propose using distributed vector representations of words, trained with prediction-based neural embedding models, to detect the senses of search queries and to cluster search engine results pages according to these senses. The words from titles and snippets, together with the semantic relationships between them, form a graph, which is then partitioned into components related to different query senses. This approach to search engine results clustering is evaluated against a new manually annotated evaluation data set of Russian search queries. We show that in the task of semantically clustering search results, prediction-based models slightly but consistently outperform traditional count-based ones when trained on the same corpora.
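The graph-based idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy two-dimensional vectors stand in for a trained embedding model, the cosine threshold of 0.7 is arbitrary, the query word itself is left out of the graph, and plain connected components stand in for whatever partitioning method the authors actually use. The function names (`cosine`, `build_graph`, `connected_components`, `cluster_snippets`) are hypothetical.

```python
import math

# Hypothetical toy vectors; a real system would load a trained embedding model.
TOY_VECTORS = {
    "car": (1.0, 0.1), "engine": (0.9, 0.2), "speed": (0.8, 0.1),
    "cat": (0.1, 1.0), "jungle": (0.2, 0.9), "predator": (0.1, 0.8),
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_graph(words, vectors, threshold):
    """Link two words when their vectors are similar enough (assumed criterion)."""
    graph = {w: set() for w in words}
    for i, w1 in enumerate(words):
        for w2 in words[i + 1:]:
            if cosine(vectors[w1], vectors[w2]) >= threshold:
                graph[w1].add(w2)
                graph[w2].add(w1)
    return graph

def connected_components(graph):
    """Partition the graph into connected components via depth-first search."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(graph[node] - comp)
        seen |= comp
        components.append(comp)
    return components

def cluster_snippets(snippets, vectors, query, threshold=0.7):
    """Label each snippet with the index of the component sharing most of its words."""
    tokenized = [
        [t for t in s.lower().split() if t != query and t in vectors]
        for s in snippets
    ]
    vocab = sorted({t for toks in tokenized for t in toks})
    components = connected_components(build_graph(vocab, vectors, threshold))
    labels = []
    for toks in tokenized:
        overlaps = [len(comp.intersection(toks)) for comp in components]
        best = max(overlaps, default=0)
        labels.append(overlaps.index(best) if best > 0 else None)
    return labels

snippets = [
    "jaguar car engine",
    "jaguar cat jungle",
    "jaguar speed car",
    "jaguar predator jungle",
]
labels = cluster_snippets(snippets, TOY_VECTORS, query="jaguar")
print(labels)
```

In this toy setting, the snippets about the vehicle and the snippets about the animal end up with different component labels. With real embeddings the graph is rarely this cleanly separable, so a threshold-plus-components scheme should be read only as the simplest possible stand-in for the partitioning step.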
Anthology ID:
L16-1486
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3044–3048
URL:
https://aclanthology.org/L16-1486
Cite (ACL):
Andrey Kutuzov and Elizaveta Kuzmenko. 2016. Neural Embedding Language Models in Semantic Clustering of Web Search Results. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 3044–3048, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Neural Embedding Language Models in Semantic Clustering of Web Search Results (Kutuzov & Kuzmenko, LREC 2016)
PDF:
https://aclanthology.org/L16-1486.pdf