Layer-wise Model Pruning based on Mutual Information

Chun Fan, Jiwei Li, Tianwei Zhang, Xiang Ao, Fei Wu, Yuxian Meng, Xiaofei Sun


Abstract
Inspired by mutual information (MI) based feature selection in SVMs and logistic regression, in this paper, we propose MI-based layer-wise pruning: for each layer of a multi-layer neural network, neurons with higher values of MI with respect to preserved neurons in the upper layer are preserved. Starting from the top softmax layer, layer-wise pruning proceeds in a top-down fashion until reaching the bottom word embedding layer. The proposed pruning strategy offers merits over weight-based pruning techniques: (1) it avoids irregular memory access since representations and matrices can be squeezed into their smaller but dense counterparts, leading to greater speedup; (2) in a manner of top-down pruning, the proposed method operates from a more global perspective based on training signals in the top layer, and prunes each layer by propagating the effect of global signals through layers, leading to better performances at the same sparsity level. Extensive experiments show that at the same sparsity level, the proposed strategy offers both greater speedup and higher performances than weight-based pruning methods (e.g., magnitude pruning, movement pruning).
Anthology ID:
2021.emnlp-main.246
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
3079–3090
Language:
URL:
https://aclanthology.org/2021.emnlp-main.246
DOI:
10.18653/v1/2021.emnlp-main.246
Bibkey:
Cite (ACL):
Chun Fan, Jiwei Li, Tianwei Zhang, Xiang Ao, Fei Wu, Yuxian Meng, and Xiaofei Sun. 2021. Layer-wise Model Pruning based on Mutual Information. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 3079–3090, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Layer-wise Model Pruning based on Mutual Information (Fan et al., EMNLP 2021)
Copy Citation:
PDF:
https://aclanthology.org/2021.emnlp-main.246.pdf
Video:
 https://aclanthology.org/2021.emnlp-main.246.mp4
Data
MultiNLISQuADSSTSST-5