Exploring the Impact of Model Scaling on Parameter-Efficient Tuning

Yusheng Su; Chi-Min Chan; Jiali Cheng; Yujia Qin; Yankai Lin; Shengding Hu; Zonghan Yang; Ning Ding; Xingzhi Sun; Guotong Xie; Zhiyuan Liu; Maosong Sun

doi:10.18653/v1/2023.emnlp-main.931

Abstract

Parameter-efficient tuning (PET) methods can effectively drive extremely large pre-trained language models (PLMs) by training only minimal parameters. Different PET methods utilize different manually designed tunable modules. In small PLMs, there are usually noticeable performance differences among PET methods. Nevertheless, as the model scale increases, the performance differences become marginal. Hence, we hypothesize that model scaling mitigates the impact of design differences on PET methods. To investigate this hypothesis, we introduce a more flexible PET method called the Arbitrary PET (APET) method. The APET method is compatible with a tunable module that consists of any number of parameters distributed in arbitrary positions. We then use it to conduct experiments on 11 NLP tasks across 3 representative PLMs. Our investigations reveal that model scaling (1) mitigates the effects of the positions of tunable parameters on performance, and (2) enables tuning methods to achieve performance comparable to full-parameter fine-tuning by optimizing fewer tunable parameters. Intriguingly, we also observe that tuning methods need to optimize a similar number of tunable parameters to exceed random-guess performance on different tasks. We discuss this phenomenon together with the two aforementioned findings from an optimization perspective to understand the underlying mechanisms. These conclusions enhance our understanding of the impact of model scaling on PET and assist in designing more effective and efficient PET methods for PLMs of different scales. The source code can be obtained from this GitHub repository: https://github.com/yushengsu-thu/PET_Scaling.
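
The idea of a tunable module made of "any number of parameters distributed in arbitrary positions" can be illustrated with a toy sketch. This is not the authors' implementation (see the repository above for that); it only assumes that a fixed budget of positions is sampled uniformly at random and that updates are applied at those positions alone. The names `sample_tunable_positions`, `masked_sgd_step`, and the two toy tensors are hypothetical.

```python
import random


def sample_tunable_positions(sizes, budget, seed=0):
    """Pick `budget` distinct (tensor_name, index) positions uniformly at random.

    Assumption: positions are drawn uniformly over all parameters; the paper
    may distribute them differently.
    """
    rng = random.Random(seed)
    all_positions = [(name, i) for name, size in sizes.items() for i in range(size)]
    return set(rng.sample(all_positions, budget))


def masked_sgd_step(params, grads, tunable, lr=0.1):
    """Apply a plain SGD update only at tunable positions; all others stay frozen."""
    return {
        name: [
            w - lr * g if (name, i) in tunable else w
            for i, (w, g) in enumerate(zip(values, grads[name]))
        ]
        for name, values in params.items()
    }


# Toy "model": two flat parameter tensors with made-up names and values.
params = {"attention": [0.5] * 6, "feed_forward": [0.5] * 10}
grads = {name: [1.0] * len(values) for name, values in params.items()}
sizes = {name: len(values) for name, values in params.items()}

tunable = sample_tunable_positions(sizes, budget=4)
updated = masked_sgd_step(params, grads, tunable)

# Count how many entries actually moved away from their initial value.
changed = sum(w != 0.5 for values in updated.values() for w in values)
print(changed)  # 4: only the sampled positions were updated
```

Varying `budget` and `seed` in this sketch corresponds loosely to the two factors the abstract studies: the number of tunable parameters and where they are placed.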

Anthology ID: 2023.emnlp-main.931
Volume: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 15062–15078
URL: https://aclanthology.org/2023.emnlp-main.931
DOI: 10.18653/v1/2023.emnlp-main.931
Cite (ACL): Yusheng Su, Chi-Min Chan, Jiali Cheng, Yujia Qin, Yankai Lin, Shengding Hu, Zonghan Yang, Ning Ding, Xingzhi Sun, Guotong Xie, Zhiyuan Liu, and Maosong Sun. 2023. Exploring the Impact of Model Scaling on Parameter-Efficient Tuning. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 15062–15078, Singapore. Association for Computational Linguistics.
Cite (Informal): Exploring the Impact of Model Scaling on Parameter-Efficient Tuning (Su et al., EMNLP 2023)
PDF: https://aclanthology.org/2023.emnlp-main.931.pdf
Video: https://aclanthology.org/2023.emnlp-main.931.mp4
