Avijeet Shil


2025

ESAQueryRank: Ranking Query Interpretations for Document Retrieval Using Explicit Semantic Analysis
Avijeet Shil | Wei Jin
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era

Representing query translation into relevant entities is a critical component of an information retrieval system. This paper proposes an unsupervised framework, ESAQueryRank, designed to process natural language queries by mapping n-gram phrases to Wikipedia titles and ranking potential entity and phrase combinations using Explicit Semantic Analysis (ESA). Unlike previous approaches, this framework does not rely on query expansion, syntactic parsing, or manual annotation. Instead, it leverages Wikipedia metadata (such as titles, redirects, and disambiguation pages) to disambiguate entities and identify the most relevant ones based on cosine similarity in the ESA space. ESAQueryRank is evaluated using a random set of TREC questions and compared against a keyword-based approach and a context-based question translation model (CBQT). In all comparisons of full category types, ESAQueryRank consistently shows better results than both methods. Notably, the framework excels with more complex queries, achieving improvements in Mean Reciprocal Rank (MRR) of up to 480% for intricate queries like those beginning with “Why,” even without explicitly incorporating the question type. These results demonstrate that ESAQueryRank is an effective, transparent, and domain-independent framework for building natural language interfaces.
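To make the general idea in the abstract concrete, here is a minimal sketch of matching query n-grams against a title lookup, ranking the matched titles by cosine similarity to the query in a concept space, and computing MRR. It is not the authors' implementation: the names `TITLES`, `CONCEPT_VECTORS`, `TERM_VECTORS`, `rank_candidates`, and `mean_reciprocal_rank` are hypothetical, the vectors are tiny hand-made stand-ins for real ESA vectors built from Wikipedia, and building the query vector by summing per-term vectors is an assumption made only for illustration.

```python
import math

# Hypothetical lookup: lowercase phrase -> canonical title.
# A real system would build this from Wikipedia titles and redirects.
TITLES = {"new york": "New York City", "york": "York", "population": "Population"}

# Toy concept vectors (concept -> weight); stand-ins for real ESA vectors.
CONCEPT_VECTORS = {
    "New York City": {"city": 0.9, "usa": 0.8, "population": 0.3},
    "York": {"city": 0.6, "england": 0.9},
    "Population": {"population": 0.9, "census": 0.7},
}
TERM_VECTORS = {
    "population": {"population": 1.0, "census": 0.5},
    "new": {"usa": 0.3},
    "york": {"city": 0.5, "usa": 0.4, "england": 0.4},
}


def ngrams(tokens, max_n=3):
    """Return all contiguous n-grams of length 1..max_n as strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def add_vectors(vectors):
    """Sum sparse vectors represented as dicts."""
    total = {}
    for vec in vectors:
        for concept, weight in vec.items():
            total[concept] = total.get(concept, 0.0) + weight
    return total


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(query):
    """Map query n-grams to titles and rank them by similarity to the query."""
    tokens = query.lower().split()
    # Assumption: the query vector is the sum of its term vectors.
    query_vec = add_vectors(TERM_VECTORS.get(t, {}) for t in tokens)
    candidates = {TITLES[p] for p in ngrams(tokens) if p in TITLES}
    scored = [(t, cosine(query_vec, CONCEPT_VECTORS[t])) for t in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def mean_reciprocal_rank(ranks):
    """MRR over 1-based ranks; None means no relevant result was found."""
    if not ranks:
        return 0.0
    return sum(1.0 / r for r in ranks if r) / len(ranks)


if __name__ == "__main__":
    for title, score in rank_candidates("population of new york"):
        print(f"{title}: {score:.3f}")
```

In this toy setup the bigram "new york" and the unigram "york" both match a title, and the similarity score is what separates the two readings. A real system would replace the hand-made dictionaries with vectors derived from a Wikipedia dump.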