@inproceedings{peng-wei-2025-grat,
title = "{GRAT}: Guiding Retrieval-Augmented Reasoning through Process Rewards Tree Search",
author = "Peng, Xianshu and
Wei, Wei",
editor = "Che, Wanxiang and
Nabende, Joyce and
Shutova, Ekaterina and
Pilehvar, Mohammad Taher",
booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.acl-long.1352/",
doi = "10.18653/v1/2025.acl-long.1352",
pages = "27861--27875",
ISBN = "979-8-89176-251-0",
    abstract = "Enhancing large models for complex multi-hop question answering has become a research focus in the retrieval-augmented generation (RAG) area. Many existing approaches aim to mimic human thought processes by enabling large models to perform retrieval-augmented generation step by step. However, these methods can only perform single-chain reasoning, which lacks the ability for multi-path exploration, strategic look-ahead, stepwise evaluation, and global selection. In addition, to effectively decompose complex problems, these methods can only rely on labor-intensive intermediate annotations for supervised fine-tuning. To address these issues, we propose GRAT, an algorithm guided by Monte Carlo Tree Search (MCTS) and process rewards. GRAT not only enables self-evaluation and self-correction but also assigns fine-grained rewards to each intermediate step in the search path. These fine-grained annotations can be used for model self-training, which enables GRAT to continuously self-update its problem analysis and reasoning capabilities. We conducted experiments on four multi-hop QA datasets: HotPotQA, 2WikiMultiHopQA, MuSiQue, and Bamboogle, demonstrating that GRAT outperforms various RAG-based methods. Additionally, incorporating self-training significantly enhances GRAT{'}s reasoning performance."
}