LightFormer: Light-weight Transformer Using SVD-based Weight Transfer and Parameter Sharing

Xiuqing Lv, Peng Zhang, Sunzhu Li, Guobing Gan, Yueheng Sun

Abstract
Transformer has become an important technique for natural language processing tasks, achieving great success. However, it usually requires huge storage space and computational cost, making it difficult to deploy on resource-constrained edge devices. To compress and accelerate Transformer, we propose LightFormer, which adopts a low-rank factorization initialized by SVD-based weight transfer, together with parameter sharing. The SVD-based weight transfer effectively exploits the parameter knowledge of a well-trained Transformer to speed up convergence and, combined with parameter sharing, alleviates the low-rank bottleneck problem. We validate our method on machine translation, text summarization, and text classification tasks. Experiments show that on IWSLT’14 De-En and WMT’14 En-De, LightFormer achieves performance comparable to the baseline Transformer with 3.8 times and 1.8 times fewer parameters, and achieves 2.3 times and 1.5 times speedups respectively, generally outperforming recent light-weight Transformers.
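The core compression step the abstract describes, replacing a pretrained weight matrix with two low-rank factors initialized from its truncated SVD, can be sketched in a few lines of PyTorch. The following is an illustrative sketch only (the function name and shapes are ours, not the authors' released code), and it omits the parameter-sharing component:

```python
import torch

def svd_init_factors(W: torch.Tensor, rank: int):
    """Initialize a rank-r factorization W ≈ A @ B from a pretrained
    weight matrix via truncated SVD. Illustrative sketch of SVD-based
    weight transfer, not the authors' code."""
    # Thin SVD: W = U diag(S) V^T
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    # Keep the top-`rank` singular triplets and split each singular
    # value evenly between the two factors.
    sqrt_S = torch.sqrt(S[:rank])
    A = U[:, :rank] * sqrt_S                 # (out_dim, rank)
    B = sqrt_S.unsqueeze(1) * Vh[:rank, :]   # (rank, in_dim)
    return A, B

# Example: factorize a 512x512 projection at rank 64, cutting its
# parameter count from 512*512 = 262,144 to 2*512*64 = 65,536.
W = torch.randn(512, 512)
A, B = svd_init_factors(W, rank=64)
print(torch.linalg.matrix_norm(W - A @ B))   # Frobenius reconstruction error
```

Initializing the factors this way starts training from the best rank-r approximation of the pretrained weights (by the Eckart–Young theorem), which is why it converges faster than training randomly initialized low-rank factors from scratch.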
Anthology ID: 2023.findings-acl.656
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 10323–10335
URL: https://aclanthology.org/2023.findings-acl.656
DOI: 10.18653/v1/2023.findings-acl.656
Cite (ACL):
Xiuqing Lv, Peng Zhang, Sunzhu Li, Guobing Gan, and Yueheng Sun. 2023. LightFormer: Light-weight Transformer Using SVD-based Weight Transfer and Parameter Sharing. In Findings of the Association for Computational Linguistics: ACL 2023, pages 10323–10335, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
LightFormer: Light-weight Transformer Using SVD-based Weight Transfer and Parameter Sharing (Lv et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-acl.656.pdf