FLOP-Efficient Training: Early Stopping Based on Test-Time Compute Awareness

Hossam Amer; Maryam Dialameh; Hossein Rajabzadeh; Walid Ahmed; Weiwei Zhang; Yang Liu

Abstract

Scaling training compute, measured in FLOPs, has long been shown to improve the accuracy of large language models, yet training remains resource-intensive. Prior work shows that increasing test-time compute (TTC)—for example through iterative sampling—can allow smaller models to rival or surpass much larger ones at lower overall cost. We introduce TTC-aware training, where an intermediate checkpoint and a corresponding TTC configuration can together match or exceed the accuracy of a fully trained model while requiring substantially fewer training FLOPs. Building on this insight, we propose an early stopping algorithm that jointly selects a checkpoint and TTC configuration to minimize training compute without sacrificing accuracy. To make this practical, we develop an efficient TTC evaluation method that avoids exhaustive search, and we formalize a break-even bound that identifies when increased inference compute compensates for reduced training compute. Experiments demonstrate up to 92% reductions in training FLOPs while maintaining and sometimes remarkably improving accuracy. These results highlight a new perspective for balancing training and inference compute in model development, enabling faster deployment cycles and more frequent model refreshes.
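The break-even idea in the abstract can be illustrated with simple arithmetic. The sketch below is not the paper's bound. It assumes the common rule-of-thumb estimates of about 6·N·D FLOPs for training and about 2·N FLOPs per generated token for inference, and it assumes the fully trained model answers with a single sample while the earlier checkpoint uses several. All function and parameter names are hypothetical.

```python
import math


def training_flops(n_params: float, n_tokens: float) -> float:
    # Rule-of-thumb estimate (assumption): ~6 FLOPs per parameter per training token.
    return 6.0 * n_params * n_tokens


def inference_flops_per_query(n_params: float, tokens_per_sample: float, num_samples: int) -> float:
    # Rule-of-thumb estimate (assumption): ~2 FLOPs per parameter per generated token,
    # multiplied by the number of samples drawn per query.
    return 2.0 * n_params * tokens_per_sample * num_samples


def break_even_queries(
    n_params: float,
    full_tokens: float,
    ckpt_tokens: float,
    tokens_per_sample: float,
    num_samples: int,
) -> float:
    """Number of queries after which the extra inference compute of the
    early checkpoint has used up the training compute it saved."""
    saved = training_flops(n_params, full_tokens) - training_flops(n_params, ckpt_tokens)
    extra = (
        inference_flops_per_query(n_params, tokens_per_sample, num_samples)
        - inference_flops_per_query(n_params, tokens_per_sample, 1)
    )
    if extra <= 0:
        # No extra inference cost: the saving is never used up.
        return math.inf
    return saved / extra


if __name__ == "__main__":
    # Hypothetical numbers: 1B parameters, stop at 20B of 100B training tokens,
    # 500 generated tokens per sample, 4 samples per query.
    print(break_even_queries(1e9, 1e11, 2e10, 500, 4))
```

Below the returned query count, stopping early is cheaper overall under these assumptions; above it, the cumulative sampling cost exceeds the training savings.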

Anthology ID: 2026.findings-acl.1766
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 35431–35449
URL: https://aclanthology.org/2026.findings-acl.1766/
Cite (ACL): Hossam Amer, Maryam Dialameh, Hossein Rajabzadeh, Walid Ahmed, Weiwei Zhang, and Yang Liu. 2026. FLOP-Efficient Training: Early Stopping Based on Test-Time Compute Awareness. In Findings of the Association for Computational Linguistics: ACL 2026, pages 35431–35449, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): FLOP-Efficient Training: Early Stopping Based on Test-Time Compute Awareness (Amer et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.1766.pdf
Checklist: 2026.findings-acl.1766.checklist.pdf
