SPaCe: Unlocking Sample-Efficient Large Language Models Training With Self-Pace Curriculum Learning

Van Dai Do; Manh Nguyen; Svetha Venkatesh; Hung Le

Abstract

Large language models (LLMs) have shown strong reasoning capabilities when fine-tuned with reinforcement learning (RL). However, such methods require extensive data and compute, making them impractical under many realistic training budgets. Many existing pipelines sample training examples uniformly across steps or epochs, ignoring differences in difficulty, redundancy, and learning value, which slows learning and wastes computation. We propose SPaCe, a self-paced learning framework that enables efficient learning based on the capability of the model being trained through optimizing which data to use and when. First, we apply cluster-based data reduction to partition training data by semantics and difficulty, extracting a compact yet diverse subset that reduces redundancy. Then, a multi-armed bandit treats data clusters as arms, allocating training samples based on the model’s solve rates and learning progress. Experiments across multiple reasoning benchmarks show that SPaCe achieves comparable or better accuracy than state-of-the-art baselines while using up to 100 times fewer samples. Ablation studies and analyses further highlight the importance of both data clustering and adaptive selection. Our results demonstrate that carefully curated, performance-driven training curricula can unlock strong reasoning abilities in LLMs with minimal resources.
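
The bandit-over-clusters idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ClusterBandit`, the softmax selection rule, the exponential smoothing, and all parameter values are assumptions chosen for illustration. The abstract states only that data clusters act as arms and that allocation depends on solve rates and learning progress.

```python
import math
import random


class ClusterBandit:
    """Toy bandit over data clusters (hypothetical, for illustration only).

    Each cluster is an arm. The reward signal is the absolute change in a
    smoothed solve-rate estimate, used here as a stand-in for "learning
    progress". Arms are sampled with a softmax over that signal.
    """

    def __init__(self, n_clusters, temperature=0.1, smoothing=0.5, seed=0):
        self.solve_rate = [0.0] * n_clusters  # smoothed solve rate per cluster
        self.progress = [0.0] * n_clusters    # smoothed |change in solve rate|
        self.temperature = temperature        # assumed softmax temperature
        self.smoothing = smoothing            # assumed smoothing factor
        self.rng = random.Random(seed)

    def probabilities(self):
        # Softmax over learning progress; subtract the max for stability.
        top = max(self.progress)
        weights = [math.exp((p - top) / self.temperature) for p in self.progress]
        total = sum(weights)
        return [w / total for w in weights]

    def select(self):
        # Sample one cluster index according to the softmax probabilities.
        arms = range(len(self.progress))
        return self.rng.choices(arms, weights=self.probabilities(), k=1)[0]

    def update(self, cluster, batch_solve_rate):
        # Blend the observed batch solve rate into the running estimate,
        # then track how much the estimate moved as learning progress.
        old = self.solve_rate[cluster]
        new = (1 - self.smoothing) * old + self.smoothing * batch_solve_rate
        self.solve_rate[cluster] = new
        self.progress[cluster] = (
            (1 - self.smoothing) * self.progress[cluster]
            + self.smoothing * abs(new - old)
        )
```

In a training loop under these assumptions, one would call `select()` to pick a cluster, train on a batch drawn from it, measure the fraction of problems solved, and pass that fraction to `update()`. Clusters whose solve rate is neither stuck near zero nor saturated tend to show the largest changes and so receive more samples.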

Anthology ID: 2026.findings-acl.171
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3480–3507
URL: https://aclanthology.org/2026.findings-acl.171/
Cite (ACL): Van Dai Do, Manh Nguyen, Svetha Venkatesh, and Hung Le. 2026. SPaCe: Unlocking Sample-Efficient Large Language Models Training With Self-Pace Curriculum Learning. In Findings of the Association for Computational Linguistics: ACL 2026, pages 3480–3507, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): SPaCe: Unlocking Sample-Efficient Large Language Models Training With Self-Pace Curriculum Learning (Do et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.171.pdf
Checklist: 2026.findings-acl.171.checklist.pdf
