One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning

Guangtao Zeng; Peiyuan Zhang; Wei Lu

doi:10.18653/v1/2023.acl-long.418

Abstract

Fine-tuning pre-trained language models for multiple tasks can be expensive in terms of storage. Parameter-efficient transfer learning (PETL) methods have been proposed to address this issue, but they still require a significant number of parameters when applied to a broader range of tasks. To achieve even greater storage reduction, we propose ProPETL, a novel method that enables efficient sharing of a single prototype PETL network (e.g., adapter, LoRA, or prefix-tuning) across layers and tasks. We learn binary masks to select different sub-networks from the prototype network and apply them as PETL modules in different layers. We find that the binary masks can determine crucial structural information from the network, which is often ignored in previous studies. Our work can also be seen as a type of pruning method, where we find that overparameterization also exists in the seemingly small PETL modules. We evaluate ProPETL on various downstream tasks and show that it can outperform other PETL methods while using only around 10% of the parameters they require.
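
The masking idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the matrix sizes, the 32-bit float assumption, and above all the random masks are placeholders chosen for illustration (ProPETL learns its masks, whereas this sketch only samples them). It shows one shared prototype weight matrix being turned into different per-layer sub-networks by elementwise binary masks, and compares the storage of that scheme with storing a separate matrix for every layer.

```python
import random


def make_prototype(rows, cols, seed=0):
    """Build one shared prototype weight matrix (hypothetical shape and init)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]


def make_binary_mask(rows, cols, keep_ratio, seed):
    """Sample a 0/1 mask. Placeholder only: the paper learns its masks."""
    rng = random.Random(seed)
    return [[1 if rng.random() < keep_ratio else 0 for _ in range(cols)]
            for _ in range(rows)]


def apply_mask(weights, mask):
    """Select a sub-network: keep entries where the mask is 1, zero the rest."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]


def storage_bits(num_layers, rows, cols, float_bits=32):
    """Return (per_layer_bits, shared_bits) for the two storage schemes.

    per_layer: every layer stores its own full-precision matrix.
    shared:    one full-precision prototype plus a 1-bit mask per layer.
    """
    size = rows * cols
    per_layer = num_layers * size * float_bits
    shared = size * float_bits + num_layers * size * 1
    return per_layer, shared


if __name__ == "__main__":
    prototype = make_prototype(rows=8, cols=4)
    # One mask per layer; each layer sees a different sub-network of the
    # same prototype.
    layer_modules = [
        apply_mask(prototype, make_binary_mask(8, 4, keep_ratio=0.5, seed=layer))
        for layer in range(3)
    ]
    print("layers built:", len(layer_modules))

    per_layer, shared = storage_bits(num_layers=12, rows=768, cols=64)
    print("per-layer bits:", per_layer)
    print("shared prototype + masks bits:", shared)
    print("ratio:", round(shared / per_layer, 4))
```

Under these assumed sizes, the saving comes from replacing each layer's full-precision matrix with a 1-bit-per-entry mask over a single shared matrix. The exact ratio depends on the number of layers and tasks and on the numeric precision, so the printed figure should not be read as a reproduction of the paper's reported numbers.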

Anthology ID: 2023.acl-long.418
Volume: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 7564–7580
URL: https://aclanthology.org/2023.acl-long.418/
DOI: 10.18653/v1/2023.acl-long.418
Cite (ACL): Guangtao Zeng, Peiyuan Zhang, and Wei Lu. 2023. One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7564–7580, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning (Zeng et al., ACL 2023)
PDF: https://aclanthology.org/2023.acl-long.418.pdf
Video: https://aclanthology.org/2023.acl-long.418.mp4
