OptiPrune: Effective Pruning Approach for Every Target Sparsity

Khang Nguyen Le, Ryo Sato, Dai Nakashima, Takeshi Suzuki, Minh Le Nguyen
Abstract
Large language models (LLMs) have achieved notable success across various tasks but are hindered by their large size and high computational demands. Post-training pruning (PTP) offers a promising solution by reducing model size through parameter removal while preserving performance. However, current PTP methods perform optimally only within specific sparsity ranges. This paper presents two key findings: (1) layerwise uniform sparsity is effective at low sparsity, while non-uniform sparsity excels at high sparsity levels; (2) relative importance-based pruning works best at low sparsity, whereas Hessian-based weight reconstruction is superior at high sparsity. We design and conduct experiments to validate these findings. Based on these insights, we introduce OptiPrune, a robust pruning method effective across all sparsity levels. OptiPrune applies non-uniform sparsity with adaptive deviation and employs a threshold to select the optimal pruning strategy. Empirical results across diverse datasets, architectures, and languages validate its performance and robustness. These findings provide valuable directions for future LLM pruning research. Our code and data are publicly available.
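
As a rough illustration of the mechanism the abstract describes, the Python sketch below shows how a single sparsity threshold might route between the two regimes: uniform layerwise allocation with relative importance-based scoring at low target sparsity, and non-uniform allocation with adaptive deviation plus Hessian-based reconstruction at high target sparsity. This is a minimal sketch under assumed details: the threshold value (0.5), the function names, and the linear deviation schedule are hypothetical and are not taken from the paper.

    import numpy as np

    def allocate_layer_sparsity(target_sparsity: float, num_layers: int,
                                max_deviation: float = 0.1) -> np.ndarray:
        """Assign a per-layer sparsity budget (hypothetical sketch).

        At low target sparsity, every layer receives the same (uniform)
        budget. At high target sparsity, budgets spread around the target,
        with the spread growing as sparsity increases (adaptive deviation).
        """
        if target_sparsity <= 0.5:  # illustrative threshold, not the paper's
            # Uniform allocation: prune every layer to the same ratio.
            return np.full(num_layers, target_sparsity)
        # Non-uniform allocation: deviation grows linearly past the threshold.
        deviation = max_deviation * (target_sparsity - 0.5) / 0.5
        offsets = np.linspace(-deviation, deviation, num_layers)
        return np.clip(target_sparsity + offsets, 0.0, 1.0)

    def select_pruning_strategy(target_sparsity: float) -> str:
        """Pick the weight-scoring strategy for the given sparsity level."""
        if target_sparsity <= 0.5:  # same illustrative threshold
            return "relative-importance"   # e.g., magnitude/activation scores
        return "hessian-reconstruction"    # e.g., OBS-style weight updates

    # Example: a 32-layer model pruned to 70% overall sparsity.
    budgets = allocate_layer_sparsity(0.7, num_layers=32)
    print(select_pruning_strategy(0.7), budgets.round(3))

The design point mirrored here is that one scalar threshold decides both the sparsity allocation scheme and the weight-scoring strategy, which is how a single method can stay effective across the whole sparsity range.
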
Anthology ID:
2025.coling-main.243
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
3600–3612
URL:
https://aclanthology.org/2025.coling-main.243/
Cite (ACL):
Khang Nguyen Le, Ryo Sato, Dai Nakashima, Takeshi Suzuki, and Minh Le Nguyen. 2025. OptiPrune: Effective Pruning Approach for Every Target Sparsity. In Proceedings of the 31st International Conference on Computational Linguistics, pages 3600–3612, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
OptiPrune: Effective Pruning Approach for Every Target Sparsity (Le et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.243.pdf