PQR: Improving Dense Retrieval via Potential Query Modeling

Junfeng Kang; Rui Li; Qi Liu; Yanjiang Chen; Zheng Zhang; Junzhe Jiang; Heng Yu; Yu Su

doi:10.18653/v1/2025.acl-long.660

Abstract

Dense retrieval has become the mainstream paradigm in information retrieval. Its core idea is to align document embeddings with their corresponding query embeddings by maximizing their dot product. Current training data is sparse: each document is typically associated with only one or a few labeled queries, even though a single document can be retrieved by many different queries. Aligning a document with so few labeled queries loses part of its semantic information. In this paper, we propose a training-free Potential Query Retrieval (PQR) framework to address this issue. Specifically, we use a Gaussian mixture distribution to model all potential queries for a document, aiming to capture its comprehensive semantic information. To obtain this distribution, we introduce three sampling strategies that sample a large number of potential queries for each document and encode them into a semantic space. Using these sampled queries, we estimate the parameters of the distribution with the Expectation-Maximization algorithm. Finally, we propose a method to calculate similarity scores between user queries and documents under the PQR framework. Extensive experiments demonstrate the effectiveness of the proposed method.
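
The pipeline described in the abstract (sample potential queries, embed them, fit a Gaussian mixture with Expectation-Maximization, then score a user query against the fitted distribution) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's implementation: the sampled query embeddings are synthetic, the covariances are restricted to diagonal form, the names `fit_gmm_em` and `score_query` are hypothetical, and the score is simply the mixture log-likelihood of the user query, which may differ from the similarity function the authors propose.

```python
import numpy as np

def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def component_log_probs(X, weights, means, variances):
    """log(pi_k) + log N(x | mu_k, diag(var_k)) for each row x of X; shape (n, k)."""
    diff = X[:, None, :] - means[None, :, :]            # (n, k, d)
    mahal = np.sum(diff ** 2 / variances, axis=2)       # (n, k)
    log_det = np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (k,)
    return np.log(weights) - 0.5 * (mahal + log_det)

def fit_gmm_em(X, k, n_iter=50, seed=0, eps=1e-6):
    """Fit a diagonal-covariance Gaussian mixture to the rows of X with EM."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    # Initialise: means from random data points, shared variance, uniform weights.
    means = X[rng.choice(n, size=k, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + eps, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sampled query.
        log_p = component_log_probs(X, weights, means, variances)
        resp = np.exp(log_p - logsumexp(log_p, axis=1)[:, None])
        # M-step: re-estimate weights, means and variances from responsibilities.
        nk = resp.sum(axis=0) + eps
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        second_moment = (resp.T @ (X ** 2)) / nk[:, None]
        variances = np.maximum(second_moment - means ** 2, eps)
    return weights, means, variances

def score_query(q, weights, means, variances):
    """Assumed score: log-likelihood of query embedding q under the mixture."""
    return logsumexp(component_log_probs(q[None, :], weights, means, variances), axis=1)[0]

# Synthetic stand-in for the embeddings of one document's sampled potential queries.
data_rng = np.random.default_rng(0)
sampled_queries = np.vstack([
    data_rng.normal(3.0, 0.5, size=(200, 4)),
    data_rng.normal(-3.0, 0.5, size=(200, 4)),
])
weights, means, variances = fit_gmm_em(sampled_queries, k=2)

# A user query close to the sampled queries versus one far away from them.
near_score = score_query(np.full(4, 3.0), weights, means, variances)
far_score = score_query(np.full(4, 30.0), weights, means, variances)
```

In this sketch each document would carry its own mixture parameters, and ranking would compare `score_query` across documents for a given user query. The diagonal covariance keeps the per-document parameter count linear in the embedding dimension, which is a simplification chosen here for brevity rather than something the abstract states.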

Anthology ID: 2025.acl-long.660
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 13455–13469
URL: https://aclanthology.org/2025.acl-long.660/
DOI: 10.18653/v1/2025.acl-long.660
Cite (ACL): Junfeng Kang, Rui Li, Qi Liu, Yanjiang Chen, Zheng Zhang, Junzhe Jiang, Heng Yu, and Yu Su. 2025. PQR: Improving Dense Retrieval via Potential Query Modeling. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 13455–13469, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): PQR: Improving Dense Retrieval via Potential Query Modeling (Kang et al., ACL 2025)
PDF: https://aclanthology.org/2025.acl-long.660.pdf
