Q-PRM: Adaptive Query Rewriting for Retrieval-Augmented Generation via Step-level Process Supervision

Xiaopeng Ye, Chen Xu, Chaoliang Zhang, Zhaocheng Du, Jun Xu, Gang Wang, Zhenhua Dong


Abstract
Query rewriting plays a pivotal role in Retrieval-Augmented Generation (RAG) by refining real-world queries of varying complexity. Existing approaches typically rely on outcome-supervised training or heuristic rules to guide the rewriting process. However, these paradigms often struggle to handle queries with varying levels of complexity, posing over- and under-refinement problems. We identify the root cause of these issues as the absence of supervision signals for intermediate steps. To fully construct and utilize such signals, we propose Q-PRM, a novel query rewriting framework. Q-PRM reformulates the rewriting process as a Markov Decision Process (MDP) composed of atomic rewriting steps. In this way, Q-PRM can apply process-level supervision to each atomic step according to the query type, offering more targeted and effective guidance. Q-PRM comprises three key stages: (1) applying Monte Carlo Tree Search to generate step-level process supervision signals; (2) performing reinforced self-training for progressive process refinement; and (3) employing PRM-guided decoding during inference. Experiments on several open-domain QA benchmarks demonstrate that Q-PRM consistently outperforms baselines across different levels of query complexity.
Anthology ID:
2025.findings-emnlp.817
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
15113–15128
Language:
URL:
https://aclanthology.org/2025.findings-emnlp.817/
DOI:
Bibkey:
Cite (ACL):
Xiaopeng Ye, Chen Xu, Chaoliang Zhang, Zhaocheng Du, Jun Xu, Gang Wang, and Zhenhua Dong. 2025. Q-PRM: Adaptive Query Rewriting for Retrieval-Augmented Generation via Step-level Process Supervision. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 15113–15128, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Q-PRM: Adaptive Query Rewriting for Retrieval-Augmented Generation via Step-level Process Supervision (Ye et al., Findings 2025)
Copy Citation:
PDF:
https://aclanthology.org/2025.findings-emnlp.817.pdf
Checklist:
 2025.findings-emnlp.817.checklist.pdf