@inproceedings{liu-etal-2025-cola,
title = "{C}o{LA}: Compute-Efficient Pre-Training of {LLM}s via Low-Rank Activation",
author = "Liu, Ziyue and
Zhang, Ruijie and
Wang, Zhengyang and
Yan, Mingsong and
Yang, Zi and
Hovland, Paul D. and
Nicolae, Bogdan and
Cappello, Franck and
Tang, Sui and
Zhang, Zheng",
editor = "Christodoulopoulos, Christos and
Chakraborty, Tanmoy and
Rose, Carolyn and
Peng, Violet",
booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.emnlp-main.230/",
pages = "4627--4645",
ISBN = "979-8-89176-332-6",
abstract = "The full-size MLPs and the projection layers in attention introduce tremendous model sizes of large language models (LLMs), consuming extensive computational resources in pre-training. We empirically observe that the activations of pre-trained LLMs exhibit low-rank property. Motivated by such observations, we propose **CoLA** and its memory-efficient implementation, **CoLA-M**, to replace these full-size layers with compute-efficient **auto-encoders** that naturally enforce low-rank activations throughout training. This fundamental architectural change eliminates the activation redundancy and significantly boosts model capacity and training efficiency. Experiments on LLaMA models with 60 million to 7 billion parameters show that CoLA reduces the computing cost by $\bf 2\pmb{\times}$ and improves training throughput by $\bf 1.86\pmb{\times}$ while maintaining full-rank level performance. CoLA-M further squeezes memory cost without sacrificing throughput, offering a pre-training approach with collectively superior parameter, computing, and memory efficiency. The LLMs produced are also $\bf 2\pmb{\times}$ smaller, enabling faster inference with lower memory cost on resource-constrained platforms."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="liu-etal-2025-cola">
<titleInfo>
<title>CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation</title>
</titleInfo>
<name type="personal">
<namePart type="given">Ziyue</namePart>
<namePart type="family">Liu</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ruijie</namePart>
<namePart type="family">Zhang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Zhengyang</namePart>
<namePart type="family">Wang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Mingsong</namePart>
<namePart type="family">Yan</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Zi</namePart>
<namePart type="family">Yang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Paul</namePart>
<namePart type="given">D</namePart>
<namePart type="family">Hovland</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Bogdan</namePart>
<namePart type="family">Nicolae</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Franck</namePart>
<namePart type="family">Cappello</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sui</namePart>
<namePart type="family">Tang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Zheng</namePart>
<namePart type="family">Zhang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-11</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing</title>
</titleInfo>
<name type="personal">
<namePart type="given">Christos</namePart>
<namePart type="family">Christodoulopoulos</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Tanmoy</namePart>
<namePart type="family">Chakraborty</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Carolyn</namePart>
<namePart type="family">Rose</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Violet</namePart>
<namePart type="family">Peng</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Suzhou, China</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-332-6</identifier>
</relatedItem>
<abstract>The full-size MLPs and the projection layers in attention introduce tremendous model sizes of large language models (LLMs), consuming extensive computational resources in pre-training. We empirically observe that the activations of pre-trained LLMs exhibit low-rank property. Motivated by such observations, we propose CoLA and its memory-efficient implementation, CoLA-M, to replace these full-size layers with compute-efficient auto-encoders that naturally enforce low-rank activations throughout training. This fundamental architectural change eliminates the activation redundancy and significantly boosts model capacity and training efficiency. Experiments on LLaMA models with 60 million to 7 billion parameters show that CoLA reduces the computing cost by 2× and improves training throughput by 1.86× while maintaining full-rank level performance. CoLA-M further squeezes memory cost without sacrificing throughput, offering a pre-training approach with collectively superior parameter, computing, and memory efficiency. The LLMs produced are also 2× smaller, enabling faster inference with lower memory cost on resource-constrained platforms.</abstract>
<identifier type="citekey">liu-etal-2025-cola</identifier>
<location>
<url>https://aclanthology.org/2025.emnlp-main.230/</url>
</location>
<part>
<date>2025-11</date>
<extent unit="page">
<start>4627</start>
<end>4645</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation
%A Liu, Ziyue
%A Zhang, Ruijie
%A Wang, Zhengyang
%A Yan, Mingsong
%A Yang, Zi
%A Hovland, Paul D.
%A Nicolae, Bogdan
%A Cappello, Franck
%A Tang, Sui
%A Zhang, Zheng
%Y Christodoulopoulos, Christos
%Y Chakraborty, Tanmoy
%Y Rose, Carolyn
%Y Peng, Violet
%S Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
%D 2025
%8 November
%I Association for Computational Linguistics
%C Suzhou, China
%@ 979-8-89176-332-6
%F liu-etal-2025-cola
%X The full-size MLPs and the projection layers in attention introduce tremendous model sizes of large language models (LLMs), consuming extensive computational resources in pre-training. We empirically observe that the activations of pre-trained LLMs exhibit low-rank property. Motivated by such observations, we propose CoLA and its memory-efficient implementation, CoLA-M, to replace these full-size layers with compute-efficient auto-encoders that naturally enforce low-rank activations throughout training. This fundamental architectural change eliminates the activation redundancy and significantly boosts model capacity and training efficiency. Experiments on LLaMA models with 60 million to 7 billion parameters show that CoLA reduces the computing cost by 2× and improves training throughput by 1.86× while maintaining full-rank level performance. CoLA-M further squeezes memory cost without sacrificing throughput, offering a pre-training approach with collectively superior parameter, computing, and memory efficiency. The LLMs produced are also 2× smaller, enabling faster inference with lower memory cost on resource-constrained platforms.
%U https://aclanthology.org/2025.emnlp-main.230/
%P 4627-4645
Markdown (Informal)
[CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation](https://aclanthology.org/2025.emnlp-main.230/) (Liu et al., EMNLP 2025)
ACL
- Ziyue Liu, Ruijie Zhang, Zhengyang Wang, Mingsong Yan, Zi Yang, Paul D. Hovland, Bogdan Nicolae, Franck Cappello, Sui Tang, and Zheng Zhang. 2025. CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 4627–4645, Suzhou, China. Association for Computational Linguistics.
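
To make the abstract's core idea concrete, below is a minimal, hypothetical sketch of replacing a full-size linear layer with a low-rank bottleneck (down-projection, nonlinearity, up-projection) so that activations passing through it are at most rank r. The class name, the choice of SiLU, the rank, and the layer sizes are illustrative assumptions based only on the abstract; this is not the authors' released CoLA/CoLA-M implementation.

```python
import torch
import torch.nn as nn


class LowRankBottleneck(nn.Module):
    """Illustrative low-rank replacement for a full d_in x d_out linear layer.

    Sketch under assumptions: the full weight is split into a down-projection
    A (d_in -> r), a nonlinearity, and an up-projection B (r -> d_out), so the
    intermediate activation has rank at most r. Rank choice, nonlinearity
    placement, and which layers get replaced are assumptions, not the exact
    CoLA design.
    """

    def __init__(self, d_in: int, d_out: int, rank: int):
        super().__init__()
        self.down = nn.Linear(d_in, rank, bias=False)   # A: d_in -> r
        self.act = nn.SiLU()                            # nonlinearity between factors (assumed)
        self.up = nn.Linear(rank, d_out, bias=False)    # B: r -> d_out

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.up(self.act(self.down(x)))


if __name__ == "__main__":
    # Hypothetical LLaMA-like layer sizes and rank, for a rough parameter comparison.
    d_model, d_ff, r = 4096, 11008, 1024
    full = nn.Linear(d_model, d_ff, bias=False)
    low_rank = LowRankBottleneck(d_model, d_ff, rank=r)

    n_full = sum(p.numel() for p in full.parameters())
    n_low = sum(p.numel() for p in low_rank.parameters())
    print(f"full: {n_full:,} params | low-rank: {n_low:,} params "
          f"({n_low / n_full:.1%} of full)")
```

With these example sizes the low-rank layer holds roughly a third of the parameters of the full layer, which is the kind of per-layer compute and size reduction the abstract's 2× figures refer to at the whole-model level; the exact savings depend on the rank and on which layers are replaced.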