GRASP: Replace Redundant Layers with Adaptive Singular Parameters for Efficient Model Compression

Kainan Liu; Yong Zhang; Ning Cheng; Zhitao Li; Shaojun Wang; Jing Xiao

doi:10.18653/v1/2025.emnlp-main.1338

Abstract

Recent studies have demonstrated that many layers are functionally redundant in large language models (LLMs), enabling model compression by removing these layers to reduce inference cost. While such approaches can improve efficiency, indiscriminate layer pruning often results in significant performance degradation. In this paper, we propose **GRASP** (**G**radient-based **R**etention of **A**daptive **S**ingular **P**arameters), a novel compression framework that mitigates this issue by preserving sensitivity-aware singular values. Unlike direct layer pruning, GRASP leverages gradient-based attribution on a small calibration dataset to adaptively identify and retain critical singular components. By replacing redundant layers with only a minimal set of parameters, GRASP achieves efficient compression while maintaining strong performance with minimal overhead. Experiments across multiple LLMs show that GRASP consistently outperforms existing compression methods, achieving 90% of the original model’s performance under a 20% compression ratio.
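The abstract does not spell out the scoring rule, so the following is only a minimal sketch of what gradient-based selection of singular components could look like, not the authors' implementation. It assumes a first-order, Taylor-style score `|sigma_i * dL/dsigma_i|`, where `dL/dsigma_i = u_i^T G v_i` and `G` is the gradient of a calibration loss with respect to the weight matrix. The function name `select_singular_components` and the parameter `k` are hypothetical.

```python
import numpy as np

def select_singular_components(W, G, k):
    """Keep the k singular components of W with the largest sensitivity score.

    W: weight matrix of shape (m, n).
    G: gradient of a calibration loss w.r.t. W, same shape as W.
    The score |sigma_i * u_i^T G v_i| is an assumed criterion for illustration.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    # Since W = sum_i sigma_i * u_i v_i^T, the chain rule gives
    # dL/dsigma_i = u_i^T G v_i for each singular component i.
    grad_sigma = np.einsum("mi,mn,in->i", U, G, Vt)
    scores = np.abs(S * grad_sigma)
    # Indices of the k highest-scoring components, kept in ascending order.
    keep = np.sort(np.argsort(scores)[::-1][:k])
    return U[:, keep], S[keep], Vt[keep, :], keep

# Toy case: the loss is sensitive only to the smaller singular direction.
W_toy = np.diag([3.0, 1.0])
G_toy = np.array([[0.0, 0.0], [0.0, 5.0]])
_, _, _, kept = select_singular_components(W_toy, G_toy, k=1)
print(kept)  # the retained index is 1, the component with the smaller singular value
```

In the toy case, plain magnitude-based truncation would keep the component with singular value 3, whereas the gradient-weighted score keeps the one the loss actually depends on. This is the kind of difference the abstract points to when it contrasts sensitivity-aware retention with indiscriminate pruning.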

Anthology ID: 2025.emnlp-main.1338
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 26333–26348
URL: https://aclanthology.org/2025.emnlp-main.1338/
DOI: 10.18653/v1/2025.emnlp-main.1338
Cite (ACL): Kainan Liu, Yong Zhang, Ning Cheng, Zhitao Li, Shaojun Wang, and Jing Xiao. 2025. GRASP: Replace Redundant Layers with Adaptive Singular Parameters for Efficient Model Compression. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 26333–26348, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): GRASP: Replace Redundant Layers with Adaptive Singular Parameters for Efficient Model Compression (Liu et al., EMNLP 2025)
PDF: https://aclanthology.org/2025.emnlp-main.1338.pdf
Checklist: 2025.emnlp-main.1338.checklist.pdf
