Solve-Detect-Verify: Inference-Time Scaling with Flexible Generative Verifier

Jianyuan Zhong; Zeju Li; Zhijian Xu; Xiangyu Wen; Kezhi Li; Qiang Xu

Abstract

Complex reasoning with Large Language Models (LLMs) demands a careful balance between accuracy and computational cost. Verification is crucial for reliability but faces a trade-off: robust process-based verifiers are computationally prohibitive, while fast verifiers lack precision. We introduce flexive, a unified generative verifier designed to navigate this trade-off by dynamically allocating compute between rapid fast thinking and deliberative slow thinking. A key innovation is our training strategy: we use Group Relative Policy Optimization (GRPO) to specifically enhance the reliability of the fast mode. This targeted training generalizes effectively, elevating the slow mode to state-of-the-art open-source performance. To deploy flexive, we propose the solve-detect-verify (SDV) pipeline. Moving beyond static Best-of-N ranking, SDV employs an iterative refinement process that uses likelihood-based probing to detect solution completion, which curtails overthinking, and leverages flexive’s feedback for targeted correction. Solve-detect-verify establishes a new open-source state-of-the-art on ProcessBench, outperforming GenPRM-32B while requiring ~2.3x fewer TFLOPS and 15x less training data. On AIME 2024, the full SDV pipeline achieves 83.3% accuracy, surpassing strong baselines while using significantly fewer tokens.

Anthology ID: 2026.acl-long.2190
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 47377–47401
URL: https://aclanthology.org/2026.acl-long.2190/
Cite (ACL): Jianyuan Zhong, Zeju Li, Zhijian Xu, Xiangyu Wen, Kezhi Li, and Qiang Xu. 2026. Solve-Detect-Verify: Inference-Time Scaling with Flexible Generative Verifier. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 47377–47401, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Solve-Detect-Verify: Inference-Time Scaling with Flexible Generative Verifier (Zhong et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.2190.pdf
Checklist: 2026.acl-long.2190.checklist.pdf
