1+1>2: A Synergistic Sparse and Low-Rank Compression Method for Large Language Models

Zeliang Zong, Kai Zhang, Zheyang Li, Wenming Tan, Ye Ren, Yiyan Zhai, Jilin Hu


Abstract
Large Language Models (LLMs) have demonstrated remarkable proficiency in language comprehension and generation; however, their widespread adoption is constrained by substantial bandwidth and computational demands. While pruning and low-rank approximation have each demonstrated promising performance individually, their synergy for LLMs remains underexplored. We introduce a Synergistic Sparse and Low-Rank Compression (SSLC) method for LLMs, which leverages the strengths of both techniques: low-rank approximation compresses the model by retaining its essential structure with minimal information loss, whereas sparse optimization eliminates non-essential weights, preserving those crucial for generalization. Based on theoretical analysis, we first formulate joint low-rank approximation and sparse optimization as a unified problem and solve it with an iterative optimization algorithm. Experiments on LLaMA and Qwen2.5 models (7B-70B) show that SSLC, without any additional training steps, consistently surpasses standalone methods, achieving state-of-the-art results. Notably, SSLC compresses Qwen2.5 by 50% with no performance drop and achieves at least a 1.63× speedup, offering a practical solution for efficient LLM deployment.
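The abstract describes decomposing weights into a low-rank term plus a sparse term and solving the joint problem iteratively. Below is a minimal illustrative sketch of that general idea, assuming an alternating scheme that fits W ≈ L + S by truncated SVD for L and magnitude thresholding for S; the function name, hyperparameters, and update order are assumptions for illustration, not the authors' actual SSLC algorithm (which uses its own theoretically motivated objective).

```python
import numpy as np

def sparse_plus_low_rank(W, rank=32, sparsity=0.5, iters=10):
    """Illustrative sketch only: alternately fit a rank-r term L (truncated SVD)
    and a sparse residual S (magnitude thresholding) so that W ≈ L + S.
    This is NOT the paper's SSLC algorithm, just the generic decomposition idea."""
    L = np.zeros_like(W)
    S = np.zeros_like(W)
    for _ in range(iters):
        # Low-rank step: best rank-r approximation of the residual W - S.
        U, sigma, Vt = np.linalg.svd(W - S, full_matrices=False)
        L = (U[:, :rank] * sigma[:rank]) @ Vt[:rank, :]
        # Sparse step: keep only the largest-magnitude entries of W - L.
        R = W - L
        k = int((1.0 - sparsity) * R.size)  # number of weights to keep
        thresh = np.partition(np.abs(R), -k, axis=None)[-k] if k > 0 else np.inf
        S = np.where(np.abs(R) >= thresh, R, 0.0)
    return L, S

# Toy usage: decompose a random weight matrix and check reconstruction error.
W = np.random.randn(256, 256)
L, S = sparse_plus_low_rank(W, rank=32, sparsity=0.5)
print("relative error:", np.linalg.norm(W - (L + S)) / np.linalg.norm(W))
```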
Anthology ID:
2025.findings-emnlp.765
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
14206–14220
URL:
https://aclanthology.org/2025.findings-emnlp.765/
Cite (ACL):
Zeliang Zong, Kai Zhang, Zheyang Li, Wenming Tan, Ye Ren, Yiyan Zhai, and Jilin Hu. 2025. 1+1>2: A Synergistic Sparse and Low-Rank Compression Method for Large Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 14206–14220, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
1+1>2: A Synergistic Sparse and Low-Rank Compression Method for Large Language Models (Zong et al., Findings 2025)
PDF:
https://aclanthology.org/2025.findings-emnlp.765.pdf
Checklist:
https://aclanthology.org/2025.findings-emnlp.765.checklist.pdf