CORAL: Learning Consistent Representations across Multi-step Training with Lighter Speculative Drafter

Yepeng Weng; Dianwen Mei; Huishi Qiu; Xujie Chen; Li Liu; Jiang Tian; Zhongchao Shi

doi:10.18653/v1/2025.acl-long.278

Abstract

Speculative decoding is a powerful technique that accelerates Large Language Model (LLM) inference by leveraging a lightweight speculative draft model. However, existing designs suffer in performance due to misalignment between training and inference. Recent methods have tried to solve this issue by adopting a multi-step training strategy, but the complex inputs of different training steps make it harder for the draft model to converge. To address this, we propose CORAL, a novel framework that improves both accuracy and efficiency in speculative drafting. CORAL introduces Cross-Step Representation Alignment, a method that enhances consistency across multiple training steps, significantly improving speculative drafting performance. Additionally, we identify the LM head as a major bottleneck in the inference speed of the draft model. We introduce a weight-grouping mechanism that selectively activates a subset of LM head parameters during inference, substantially reducing the latency of the draft model. We evaluate CORAL on three LLM families and three benchmark datasets, achieving speedup ratios of 2.50x to 4.07x and outperforming state-of-the-art methods such as EAGLE-2 and HASS. Our results demonstrate that CORAL effectively mitigates training-inference misalignment and delivers significant speedup for modern LLMs with large vocabularies.
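The abstract states only that the weight-grouping mechanism activates a subset of LM head parameters during inference; it does not say how the subset is chosen. The following is a minimal sketch of one way such a scheme could look, not the authors' implementation. It assumes a hypothetical two-stage design: a small router scores contiguous vocabulary groups, and only the rows of the winning group are multiplied with the hidden state. The names `grouped_lm_head`, `router_weights`, and `group_size`, the contiguous grouping, and the toy dimensions are all illustrative assumptions.

```python
import random


def matvec(rows, x):
    """Dot each weight row with the hidden vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in rows]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


def grouped_lm_head(hidden, head_weights, router_weights, group_size):
    """Pick a token using only one group of LM head rows.

    Hypothetical scheme (an assumption, not taken from the paper):
    the router scores each vocabulary group, then only the rows of the
    best-scoring group are scored against `hidden`.
    Returns (token_id, number_of_weight_rows_used).
    """
    group = argmax(matvec(router_weights, hidden))
    start = group * group_size
    group_rows = head_weights[start:start + group_size]
    logits = matvec(group_rows, hidden)
    return start + argmax(logits), len(router_weights) + len(group_rows)


# Toy sizes for illustration; real LM heads have far larger vocabularies.
rng = random.Random(0)
hidden_dim, vocab_size, group_size = 8, 32, 8
num_groups = vocab_size // group_size

head = [[rng.uniform(-1, 1) for _ in range(hidden_dim)] for _ in range(vocab_size)]
router = [[rng.uniform(-1, 1) for _ in range(hidden_dim)] for _ in range(num_groups)]
hidden = [rng.uniform(-1, 1) for _ in range(hidden_dim)]

token_id, rows_scored = grouped_lm_head(hidden, head, router, group_size)
print(f"token {token_id} chosen using {rows_scored} of {vocab_size} head rows")
```

In this toy setting the draft step touches `num_groups + group_size` weight rows instead of all `vocab_size` rows, which is the kind of saving that matters more as the vocabulary grows. The trade-off in a sketch like this is that a token outside the selected group can never be drafted.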

Anthology ID: 2025.acl-long.278
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 5580–5593
URL: https://aclanthology.org/2025.acl-long.278/
DOI: 10.18653/v1/2025.acl-long.278
Cite (ACL): Yepeng Weng, Dianwen Mei, Huishi Qiu, Xujie Chen, Li Liu, Jiang Tian, and Zhongchao Shi. 2025. CORAL: Learning Consistent Representations across Multi-step Training with Lighter Speculative Drafter. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5580–5593, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): CORAL: Learning Consistent Representations across Multi-step Training with Lighter Speculative Drafter (Weng et al., ACL 2025)
PDF: https://aclanthology.org/2025.acl-long.278.pdf
