Avijeet Shil
2025
ESAQueryRank: Ranking Query Interpretations for Document Retrieval Using Explicit Semantic Analysis
Avijeet Shil | Wei Jin
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Translating queries into relevant entities is a critical component of an information retrieval system. This paper proposes an unsupervised framework, ESAQueryRank, that processes natural language queries by mapping n-gram phrases to Wikipedia titles and ranking candidate entity and phrase combinations using Explicit Semantic Analysis (ESA). Unlike previous approaches, the framework does not rely on query expansion, syntactic parsing, or manual annotation. Instead, it leverages Wikipedia metadata, such as titles, redirects, and disambiguation pages, to disambiguate entities and identify the most relevant ones based on cosine similarity in the ESA space. ESAQueryRank is evaluated on a random set of TREC questions and compared against a keyword-based approach and a context-based question translation model (CBQT). In every comparison over the full set of category types, ESAQueryRank outperforms both methods. Notably, the framework excels on more complex queries, improving Mean Reciprocal Rank (MRR) by up to 480% for intricate queries such as those beginning with “Why,” even without explicitly incorporating the question type. These results demonstrate that ESAQueryRank is an effective, transparent, and domain-independent framework for building natural language interfaces.
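The pipeline described in the abstract (map n-grams to Wikipedia titles, then rank the resulting interpretations by cosine similarity in ESA space) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `TITLES` set and the `ESA` concept weights are hypothetical toy data rather than values from Wikipedia, and the scoring rule (mean pairwise cosine between the phrases of an interpretation) is one plausible reading of the abstract, not the formula used in the paper.

```python
import math
from itertools import combinations

# Hypothetical stand-in for Wikipedia titles and redirects (lowercased).
TITLES = {"new york", "new york times", "times", "york", "founder"}

# Hypothetical ESA vectors: phrase -> {Wikipedia concept: weight}.
ESA = {
    "new york times": {"Newspaper": 0.9, "New York City": 0.3},
    "new york": {"New York City": 0.9, "United States": 0.4},
    "times": {"Time": 0.8, "Newspaper": 0.2},
    "new": {"Novelty": 0.5},
    "york": {"York": 0.9},
    "founder": {"Entrepreneurship": 0.7, "Newspaper": 0.1},
}


def cosine(u, v):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def interpretations(tokens, max_n=3):
    """Yield each split of tokens into phrases; multi-word phrases must match a title."""
    if not tokens:
        yield []
        return
    for n in range(1, min(max_n, len(tokens)) + 1):
        phrase = " ".join(tokens[:n])
        if n == 1 or phrase in TITLES:
            for rest in interpretations(tokens[n:], max_n):
                yield [phrase] + rest


def score(phrases):
    """Assumed scoring rule: mean pairwise cosine between the phrase vectors."""
    pairs = list(combinations(phrases, 2))
    if not pairs:
        return 0.0
    total = sum(cosine(ESA.get(a, {}), ESA.get(b, {})) for a, b in pairs)
    return total / len(pairs)


def rank(query):
    """Return (score, interpretation) pairs, best first."""
    tokens = query.lower().split()
    scored = [(score(p), p) for p in interpretations(tokens)]
    return sorted(scored, key=lambda sp: sp[0], reverse=True)


if __name__ == "__main__":
    for s, phrases in rank("New York Times founder"):
        print(f"{s:.3f}  {phrases}")
```

With this toy data, the interpretation that keeps "new york times" as a single entity ranks above the splits into "new york" + "times" or into individual words, because its phrases share more weight on common concepts. A real system would derive the concept vectors from a Wikipedia-based ESA index rather than a hand-written dictionary.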