BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving

Ran Xin; Chenguang Xi; Jie Yang; Feng Chen; Hang Wu; Xia Xiao; Yifan Sun; Shen Zheng; Ming Ding

doi:10.18653/v1/2025.acl-long.1565

Abstract

Recent advancements in large language models (LLMs) have spurred growing interest in automatic theorem proving using Lean4, where effective tree search methods are crucial for navigating the underlying large proof search spaces. While existing approaches primarily rely on value functions and/or Monte Carlo Tree Search (MCTS), the potential of simpler methods like Best-First Tree Search (BFS) remains underexplored. In this paper, we investigate whether BFS can achieve competitive performance in large-scale theorem proving tasks. We present BFS-Prover, a scalable expert iteration framework featuring three key innovations. First, we implement strategic data filtering at each expert iteration round, excluding problems solvable via beam search node expansion to focus on harder cases. Second, we improve the sample efficiency of BFS through Direct Preference Optimization (DPO) applied to state-tactic pairs automatically annotated with compiler error feedback, refining the LLM’s policy to prioritize productive expansions. Third, we employ length normalization in BFS to encourage exploration of deeper proof paths. BFS-Prover achieves a state-of-the-art score of 72.95% on the MiniF2F test set. This result challenges the perceived necessity of complex tree search methods and demonstrates that BFS can achieve competitive performance when properly scaled.
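To make the third point concrete, the following is a minimal sketch of best-first search with a length-normalized priority. The abstract states only that length normalization is applied; the specific form used here (cumulative log-probability divided by depth raised to a power `alpha`) is an assumption for illustration, as are all names (`normalized_score`, `best_first_search`, `expand`, `is_proved`) and the toy proof graph. It is not the authors' implementation and does not call Lean.

```python
import heapq
import itertools
from typing import Callable, Hashable, Iterable, List, Optional, Tuple

# One proposed step: (tactic, log-probability, resulting state). Hypothetical type.
Expansion = Tuple[str, float, Hashable]


def normalized_score(cum_logprob: float, depth: int, alpha: float) -> float:
    """Assumed scoring form: cumulative log-probability divided by depth**alpha.

    alpha = 0 gives the raw cumulative log-probability; larger alpha
    penalizes long paths less, which favors deeper nodes.
    """
    return cum_logprob / (max(depth, 1) ** alpha)


def best_first_search(
    root: Hashable,
    expand: Callable[[Hashable], Iterable[Expansion]],
    is_proved: Callable[[Hashable], bool],
    alpha: float = 0.5,
    max_expansions: int = 100,
) -> Optional[List[str]]:
    """Return a list of tactics reaching a proved state, or None."""
    counter = itertools.count()  # tie-breaker so states are never compared
    # Heap entries: (negated score, tie-breaker, state, cumulative logprob, path).
    frontier = [(0.0, next(counter), root, 0.0, [])]
    seen = {root}
    for _ in range(max_expansions):
        if not frontier:
            return None
        # heapq is a min-heap, so the negated score pops the best node first.
        _, _, state, cum_lp, path = heapq.heappop(frontier)
        for tactic, logprob, nxt in expand(state):
            new_path = path + [tactic]
            if is_proved(nxt):
                return new_path
            if nxt in seen:
                continue
            seen.add(nxt)
            new_lp = cum_lp + logprob
            score = normalized_score(new_lp, len(new_path), alpha)
            heapq.heappush(frontier, (-score, next(counter), nxt, new_lp, new_path))
    return None


# Toy proof graph standing in for a policy LLM plus a proof checker.
TOY = {
    "root": [("a", -0.1, "s1"), ("b", -1.0, "dead")],
    "s1": [("c", -0.1, "s2")],
    "s2": [("d", -0.1, "QED")],
    "dead": [],
}


def toy_expand(state):
    return TOY.get(state, [])


def toy_is_proved(state):
    return state == "QED"


proof = best_first_search("root", toy_expand, toy_is_proved)
print(proof)
```

In a real system, `expand` would sample tactics from the policy model and run them through the Lean4 compiler, and `is_proved` would check that no goals remain; both are stubbed here with a dictionary lookup.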

Anthology ID: 2025.acl-long.1565
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 32588–32599
URL: https://aclanthology.org/2025.acl-long.1565/
DOI: 10.18653/v1/2025.acl-long.1565
Cite (ACL): Ran Xin, Chenguang Xi, Jie Yang, Feng Chen, Hang Wu, Xia Xiao, Yifan Sun, Shen Zheng, and Ming Ding. 2025. BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 32588–32599, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving (Xin et al., ACL 2025)
PDF: https://aclanthology.org/2025.acl-long.1565.pdf
