BibTeX
@inproceedings{wu-etal-2026-iterative,
title = "Iterative Structured Pruning for Large Language Models with Multi-Domain Calibration",
author = "Wu, Guangxin and
Zhang, Hao and
Zhang, Zhibin and
Guo, Jiafeng and
Cheng, Xueqi",
editor = {Matusevych, Yevgen and
Eryi{\u{g}}it, G{\"u}l{\c{s}}en and
Aletras, Nikolaos},
booktitle = "Proceedings of the 19th Conference of the {E}uropean Chapter of the {A}ssociation for {C}omputational {L}inguistics (Volume 5: Industry Track)",
month = mar,
year = "2026",
address = "Rabat, Morocco",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2026.eacl-industry.1/",
pages = "1--10",
ISBN = "979-8-89176-384-5",
abstract = "Large Language Models (LLMs) have achieved remarkable success across a wide spectrum of natural language processing tasks. However, their ever-growing scale introduces significant barriers to real-world deployment, including substantial computational overhead, memory footprint, and inference latency. While model pruning presents a viable solution to these challenges, existing unstructured pruning techniques often yield irregular sparsity patterns that necessitate specialized hardware or software support. In this work, we explore structured pruning, which eliminates entire architectural components and maintains compatibility with standard hardware accelerators. We introduce a novel structured pruning framework that leverages a hybrid multi-domain calibration set and an iterative calibration strategy to effectively identify and remove redundant channels. Extensive experiments on various models across diverse downstream tasks show that our approach achieves significant compression with minimal performance degradation."
}

MODS XML
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="wu-etal-2026-iterative">
<titleInfo>
<title>Iterative Structured Pruning for Large Language Models with Multi-Domain Calibration</title>
</titleInfo>
<name type="personal">
<namePart type="given">Guangxin</namePart>
<namePart type="family">Wu</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Hao</namePart>
<namePart type="family">Zhang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Zhang</namePart>
<namePart type="family">Zhibin</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jiafeng</namePart>
<namePart type="family">Guo</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Xueqi</namePart>
<namePart type="family">Cheng</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2026-03</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track)</title>
</titleInfo>
<name type="personal">
<namePart type="given">Yevgen</namePart>
<namePart type="family">Matusevych</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Gülşen</namePart>
<namePart type="family">Eryiğit</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Nikolaos</namePart>
<namePart type="family">Aletras</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Rabat, Morocco</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-384-5</identifier>
</relatedItem>
<abstract>Large Language Models (LLMs) have achieved remarkable success across a wide spectrum of natural language processing tasks. However, their ever-growing scale introduces significant barriers to real-world deployment, including substantial computational overhead, memory footprint, and inference latency. While model pruning presents a viable solution to these challenges, existing unstructured pruning techniques often yield irregular sparsity patterns that necessitate specialized hardware or software support. In this work, we explore structured pruning, which eliminates entire architectural components and maintains compatibility with standard hardware accelerators. We introduce a novel structured pruning framework that leverages a hybrid multi-domain calibration set and an iterative calibration strategy to effectively identify and remove redundant channels. Extensive experiments on various models across diverse downstream tasks show that our approach achieves significant compression with minimal performance degradation.</abstract>
<identifier type="citekey">wu-etal-2026-iterative</identifier>
<location>
<url>https://aclanthology.org/2026.eacl-industry.1/</url>
</location>
<part>
<date>2026-03</date>
<extent unit="page">
<start>1</start>
<end>10</end>
</extent>
</part>
</mods>
</modsCollection>
Endnote
%0 Conference Proceedings
%T Iterative Structured Pruning for Large Language Models with Multi-Domain Calibration
%A Wu, Guangxin
%A Zhang, Hao
%A Zhang, Zhibin
%A Guo, Jiafeng
%A Cheng, Xueqi
%Y Matusevych, Yevgen
%Y Eryiğit, Gülşen
%Y Aletras, Nikolaos
%S Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track)
%D 2026
%8 March
%I Association for Computational Linguistics
%C Rabat, Morocco
%@ 979-8-89176-384-5
%F wu-etal-2026-iterative
%X Large Language Models (LLMs) have achieved remarkable success across a wide spectrum of natural language processing tasks. However, their ever-growing scale introduces significant barriers to real-world deployment, including substantial computational overhead, memory footprint, and inference latency. While model pruning presents a viable solution to these challenges, existing unstructured pruning techniques often yield irregular sparsity patterns that necessitate specialized hardware or software support. In this work, we explore structured pruning, which eliminates entire architectural components and maintains compatibility with standard hardware accelerators. We introduce a novel structured pruning framework that leverages a hybrid multi-domain calibration set and an iterative calibration strategy to effectively identify and remove redundant channels. Extensive experiments on various models across diverse downstream tasks show that our approach achieves significant compression with minimal performance degradation.
%U https://aclanthology.org/2026.eacl-industry.1/
%P 1-10
Markdown (Informal)
[Iterative Structured Pruning for Large Language Models with Multi-Domain Calibration](https://aclanthology.org/2026.eacl-industry.1/) (Wu et al., EACL 2026)
ACL
Guangxin Wu, Hao Zhang, Zhibin Zhang, Jiafeng Guo, and Xueqi Cheng. 2026. Iterative Structured Pruning for Large Language Models with Multi-Domain Calibration. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track), pages 1–10, Rabat, Morocco. Association for Computational Linguistics.