Seer Self-Consistency: Advance Budget Estimation for Adaptive Test-Time Scaling

Shiyu Ji; Yixuan Wang; Yijun Liu; Qingfu Zhu; Wanxiang Che (车万翔)

Abstract

Test-time scaling improves the inference performance of Large Language Models (LLMs) but also incurs substantial computational costs. Although recent studies have reduced token consumption through dynamic self-consistency, they remain constrained by the high latency of sequential requests. In this paper, we propose SeerSC, a dynamic self-consistency framework that improves both token efficiency and latency by integrating System 1 and System 2 reasoning. Specifically, we use the rapid System 1 to compute the answer entropy for a given query. This score is then used to evaluate the potential of samples for scaling, enabling dynamic self-consistency under System 2. Because System 1 provides an accurate estimate in advance, the proposed method reduces token usage while also achieving a significant decrease in latency through parallel generation. It outperforms existing methods, achieving up to a 47% reduction in token consumption and a 43% reduction in inference latency without significant performance loss.
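
The idea of estimating a sampling budget in advance from answer entropy can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the linear mapping from normalised entropy to a sample count, and the `min_samples`/`max_samples` bounds are all assumptions made for illustration, since the abstract does not specify how the entropy score is converted into a budget.

```python
import math
from collections import Counter


def answer_entropy(answers):
    """Shannon entropy (in nats) of the empirical distribution over answers."""
    counts = Counter(answers)
    total = len(answers)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def estimate_budget(quick_answers, min_samples=1, max_samples=16):
    """Map the entropy of fast (System 1 style) answers to a sample budget.

    Hypothetical rule: scale linearly between min_samples and max_samples
    according to entropy normalised by its maximum, log(len(quick_answers)).
    """
    n = len(quick_answers)
    if n < 2:
        # Too few quick answers to measure disagreement: assume the full budget.
        return max_samples
    ratio = answer_entropy(quick_answers) / math.log(n)
    return min_samples + round(ratio * (max_samples - min_samples))


def majority_vote(answers):
    """Self-consistency aggregation: return the most frequent answer."""
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    # Quick answers agree, so the query looks easy and gets a small budget.
    print(estimate_budget(["4", "4", "4", "4"]))
    # Quick answers all differ, so the query looks hard and gets the full budget.
    print(estimate_budget(["1", "2", "3", "4"]))
```

Because the budget is fixed before the slower reasoning pass begins, all of the allotted samples for a query could be requested in parallel and then aggregated with `majority_vote`, rather than being drawn one at a time until a stopping rule fires.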

Anthology ID: 2026.findings-acl.2120
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 42734–42747
URL: https://aclanthology.org/2026.findings-acl.2120/
Cite (ACL): Shiyu Ji, Yixuan Wang, Yijun Liu, Qingfu Zhu, and Wanxiang Che. 2026. Seer Self-Consistency: Advance Budget Estimation for Adaptive Test-Time Scaling. In Findings of the Association for Computational Linguistics: ACL 2026, pages 42734–42747, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Seer Self-Consistency: Advance Budget Estimation for Adaptive Test-Time Scaling (Ji et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.2120.pdf
Checklist: 2026.findings-acl.2120.checklist.pdf