@inproceedings{ren-zhu-2023-pruning,
    title = "Pruning Pre-trained Language Models with Principled Importance and Self-regularization",
    author = "Ren, Siyu and
      Zhu, Kenny",
    editor = "Rogers, Anna and
      Boyd-Graber, Jordan and
      Okazaki, Naoaki",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2023",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.findings-acl.573",
    doi = "10.18653/v1/2023.findings-acl.573",
    pages = "8995--9008",
    abstract = "Iterative pruning is one of the most effective compression methods for pre-trained language models. We discovered that finding the optimal pruning decision is an equality-constrained 0-1 Integer Linear Programming problem. The solution to this optimization problem leads to a principled importance criterion which we use to rank parameters during iterative model pruning. To mitigate the poor generalization at high sparsity levels, we propose a self-regularization scheme where model prediction is regularized by the latest checkpoint with increasing sparsity throughout pruning. Our experiments on natural language understanding, question answering, named entity recognition, and data-to-text generation with various Transformer-based PLMs show the effectiveness of the approach at various sparsity levels.",
}
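The abstract frames the pruning decision as an equality-constrained 0-1 integer linear program. As a point of reference, a generic form of that view is sketched below; the notation is assumed for illustration and is not necessarily the paper's exact objective.

```latex
% Generic pruning-as-ILP sketch (assumed notation, not the paper's exact form):
% z_i = 1 keeps parameter i, z_i = 0 prunes it; s_i >= 0 is an estimated
% increase in loss from pruning parameter i; exactly k parameters are kept.
\begin{equation*}
  \min_{\mathbf{z} \in \{0,1\}^{n}} \; \sum_{i=1}^{n} s_i \, (1 - z_i)
  \quad \text{s.t.} \quad \sum_{i=1}^{n} z_i = k .
\end{equation*}
```

Any program of this form is solved by keeping the k parameters with the largest scores s_i, which is why the solution to such an optimization problem doubles as a ranking (importance) criterion during iterative pruning, as the abstract states.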
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="ren-zhu-2023-pruning">
    <titleInfo>
        <title>Pruning Pre-trained Language Models with Principled Importance and Self-regularization</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Siyu</namePart>
        <namePart type="family">Ren</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Kenny</namePart>
        <namePart type="family">Zhu</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2023-07</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Findings of the Association for Computational Linguistics: ACL 2023</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Anna</namePart>
            <namePart type="family">Rogers</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Jordan</namePart>
            <namePart type="family">Boyd-Graber</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Naoaki</namePart>
            <namePart type="family">Okazaki</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Toronto, Canada</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>Iterative pruning is one of the most effective compression methods for pre-trained language models. We discovered that finding the optimal pruning decision is an equality-constrained 0-1 Integer Linear Programming problem. The solution to this optimization problem leads to a principled importance criterion which we use to rank parameters during iterative model pruning. To mitigate the poor generalization at high sparsity levels, we propose a self-regularization scheme where model prediction is regularized by the latest checkpoint with increasing sparsity throughout pruning. Our experiments on natural language understanding, question answering, named entity recognition, and data-to-text generation with various Transformer-based PLMs show the effectiveness of the approach at various sparsity levels.</abstract>
    <identifier type="citekey">ren-zhu-2023-pruning</identifier>
    <identifier type="doi">10.18653/v1/2023.findings-acl.573</identifier>
    <location>
        <url>https://aclanthology.org/2023.findings-acl.573</url>
    </location>
    <part>
        <date>2023-07</date>
        <extent unit="page">
            <start>8995</start>
            <end>9008</end>
        </extent>
    </part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Pruning Pre-trained Language Models with Principled Importance and Self-regularization
%A Ren, Siyu
%A Zhu, Kenny
%Y Rogers, Anna
%Y Boyd-Graber, Jordan
%Y Okazaki, Naoaki
%S Findings of the Association for Computational Linguistics: ACL 2023
%D 2023
%8 July
%I Association for Computational Linguistics
%C Toronto, Canada
%F ren-zhu-2023-pruning
%X Iterative pruning is one of the most effective compression methods for pre-trained language models. We discovered that finding the optimal pruning decision is an equality-constrained 0-1 Integer Linear Programming problem. The solution to this optimization problem leads to a principled importance criterion which we use to rank parameters during iterative model pruning. To mitigate the poor generalization at high sparsity levels, we propose a self-regularization scheme where model prediction is regularized by the latest checkpoint with increasing sparsity throughout pruning. Our experiments on natural language understanding, question answering, named entity recognition, and data-to-text generation with various Transformer-based PLMs show the effectiveness of the approach at various sparsity levels.
%R 10.18653/v1/2023.findings-acl.573
%U https://aclanthology.org/2023.findings-acl.573
%U https://doi.org/10.18653/v1/2023.findings-acl.573
%P 8995-9008
Markdown (Informal)
[Pruning Pre-trained Language Models with Principled Importance and Self-regularization](https://aclanthology.org/2023.findings-acl.573) (Ren & Zhu, Findings 2023)
ACL
Siyu Ren and Kenny Zhu. 2023. [Pruning Pre-trained Language Models with Principled Importance and Self-regularization](https://aclanthology.org/2023.findings-acl.573). In *Findings of the Association for Computational Linguistics: ACL 2023*, pages 8995–9008, Toronto, Canada. Association for Computational Linguistics.
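For intuition about the method the abstract describes, here is a minimal PyTorch sketch of iterative pruning interleaved with self-regularization against the latest checkpoint. This is a sketch under stated assumptions, not the authors' implementation: the |w * grad| importance proxy stands in for the paper's ILP-derived criterion, and all names below (prune_to_sparsity, iterative_prune, alpha) are hypothetical.

```python
# Hypothetical sketch, NOT the authors' code: a first-order |w * grad|
# importance proxy replaces the paper's ILP-derived criterion.
import copy
import torch
import torch.nn.functional as F

def prune_to_sparsity(model, sparsity):
    """Zero the globally lowest-importance weights in all 2D+ tensors."""
    params = [p for p in model.parameters()
              if p.dim() >= 2 and p.grad is not None]
    scores = torch.cat([(p.detach() * p.grad).abs().flatten() for p in params])
    k = max(1, int(sparsity * scores.numel()))
    threshold = scores.kthvalue(k).values      # k-th smallest score
    with torch.no_grad():
        for p in params:
            p.mul_(((p * p.grad).abs() > threshold).to(p.dtype))

def iterative_prune(model, loader, sparsity_schedule, lr=2e-5, alpha=0.5):
    """At each sparsity level, train with the task loss plus a KL term
    against the latest checkpoint (the abstract's self-regularization)."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    teacher = None                       # latest checkpoint; None at the start
    for sparsity in sparsity_schedule:   # e.g. [0.5, 0.7, 0.8, 0.9]
        for inputs, labels in loader:
            logits = model(inputs)
            loss = F.cross_entropy(logits, labels)
            if teacher is not None:      # regularize toward latest checkpoint
                with torch.no_grad():
                    teacher_logits = teacher(inputs)
                kl = F.kl_div(F.log_softmax(logits, dim=-1),
                              F.softmax(teacher_logits, dim=-1),
                              reduction="batchmean")
                loss = (1.0 - alpha) * loss + alpha * kl
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            prune_to_sparsity(model, sparsity)  # re-zero low-importance weights
        teacher = copy.deepcopy(model).eval()   # snapshot becomes the regularizer
    return model
```

Snapshotting the model at the end of each sparsity level makes the regularizer track the pruning trajectory itself, which is one plausible reading of the abstract's "regularized by the latest checkpoint with increasing sparsity throughout pruning".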