Faithful Question Answering with Monte-Carlo Planning

Ruixin Hong; Hongming Zhang; Hong Zhao; Dong Yu (于东); Changshui Zhang

doi:10.18653/v1/2023.acl-long.218

Abstract

Although large language models demonstrate remarkable question-answering performance, revealing the intermediate reasoning steps that the models faithfully follow remains challenging. In this paper, we propose FAME (FAithful question answering with MontE-carlo planning) to answer questions based on faithful reasoning steps. The reasoning steps are organized as a structured entailment tree, which shows how premises are used to produce intermediate conclusions that can prove the correctness of the answer. We formulate the task as a discrete decision-making problem and solve it through the interaction of a reasoning environment and a controller. The environment is modular and contains several basic task-oriented modules, while the controller proposes actions to assemble the modules. Since the search space could be large, we introduce a Monte-Carlo planning algorithm to do a look-ahead search and select actions that will eventually lead to high-quality steps. FAME achieves advanced performance on the standard benchmark. Compared with large language models, it produces valid and faithful reasoning steps with a much smaller model size.
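To make the look-ahead idea in the abstract concrete, the following is a minimal sketch of flat Monte-Carlo rollout planning on a toy entailment problem. It is not the FAME algorithm or its modules: the `RULES` table (a stand-in for an entailment module), the statement names `p1`, `p2`, `p3`, `i1`, `h`, the step budget, and all function names are hypothetical choices made only for illustration.

```python
import random
from itertools import combinations

# Hypothetical stand-in for an entailment module: a pair of known
# statements maps to the intermediate conclusion they jointly support.
RULES = {
    frozenset({"p1", "p2"}): "i1",
    frozenset({"i1", "p3"}): "h",    # "h" is the hypothesis to prove
    frozenset({"p1", "p3"}): "d1",   # distractor conclusion
    frozenset({"p2", "p3"}): "d2",   # distractor conclusion
}

def legal_actions(state):
    """Pairs of known statements that yield a conclusion not yet derived."""
    actions = []
    for a, b in combinations(sorted(state), 2):
        conclusion = RULES.get(frozenset({a, b}))
        if conclusion is not None and conclusion not in state:
            actions.append((a, b))
    return actions

def step(state, action):
    """Apply one reasoning step and return the enlarged set of statements."""
    return state | {RULES[frozenset(action)]}

def rollout(state, target, depth, rng):
    """Take up to `depth` random steps; reward 1.0 if the target is derived."""
    for _ in range(depth):
        if target in state:
            return 1.0
        actions = legal_actions(state)
        if not actions:
            break
        state = step(state, rng.choice(actions))
    return 1.0 if target in state else 0.0

def plan(premises, target, budget=2, n_rollouts=50, seed=0):
    """Pick each step by the average reward of random look-ahead rollouts."""
    rng = random.Random(seed)
    state, steps = frozenset(premises), []
    while target not in state and len(steps) < budget:
        actions = legal_actions(state)
        if not actions:
            break
        remaining = budget - len(steps) - 1  # steps left after this one

        def value(action):
            nxt = step(state, action)
            total = sum(rollout(nxt, target, remaining, rng)
                        for _ in range(n_rollouts))
            return total / n_rollouts

        best = max(actions, key=value)
        steps.append((best, RULES[frozenset(best)]))
        state = step(state, best)
    return steps

if __name__ == "__main__":
    for premises_used, conclusion in plan(["p1", "p2", "p3"], "h"):
        print(premises_used, "->", conclusion)
```

With a budget of two steps, only the action combining `p1` and `p2` leaves a path to `h`, so the rollouts assign it a higher value than the two distractor actions. A tree-based planner would additionally reuse statistics across simulations; this sketch re-estimates values from scratch at every step.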

Anthology ID: 2023.acl-long.218
Volume: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 3944–3965
URL: https://aclanthology.org/2023.acl-long.218/
DOI: 10.18653/v1/2023.acl-long.218
Cite (ACL): Ruixin Hong, Hongming Zhang, Hong Zhao, Dong Yu, and Changshui Zhang. 2023. Faithful Question Answering with Monte-Carlo Planning. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3944–3965, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Faithful Question Answering with Monte-Carlo Planning (Hong et al., ACL 2023)
PDF: https://aclanthology.org/2023.acl-long.218.pdf
Video: https://aclanthology.org/2023.acl-long.218.mp4
