Parameter-Efficient Tuning with Special Token Adaptation

Xiaocong Yang, James Y. Huang, Wenxuan Zhou, Muhao Chen


Abstract
Parameter-efficient tuning aims to update only a small subset of parameters when adapting a pretrained model to downstream tasks. In this work, we introduce PASTA, in which we only modify the special token representations (e.g., [SEP] and [CLS] in BERT) before the self-attention module at each layer in Transformer-based models. PASTA achieves performance comparable to fine-tuning on natural language understanding tasks, including text classification and NER, while training at most 0.029% of the total parameters. Our work not only provides a simple yet effective approach to parameter-efficient tuning, with a wide range of practical applications when deploying fine-tuned models for multiple tasks, but also demonstrates the pivotal role of special tokens in pretrained language models.
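To make the mechanism concrete, the sketch below (PyTorch, not the authors' released implementation) shows one way to realize the idea described in the abstract: all pretrained weights stay frozen, and a single trainable vector per layer is added to the hidden states at special-token positions before self-attention. The class name PastaLayerAdapter and the mask/argument conventions are assumptions for illustration.

```python
import torch
import torch.nn as nn


class PastaLayerAdapter(nn.Module):
    """Illustrative wrapper: freezes one Transformer layer and trains only a
    per-layer shift vector added to special-token states before self-attention."""

    def __init__(self, transformer_layer: nn.Module, hidden_size: int):
        super().__init__()
        self.layer = transformer_layer
        for p in self.layer.parameters():
            p.requires_grad = False              # pretrained weights stay frozen
        # the only trainable parameters introduced at this layer
        self.special_shift = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, hidden_states, attention_mask, special_token_mask):
        # hidden_states:      (batch, seq_len, hidden_size)
        # special_token_mask: (batch, seq_len) float mask, 1.0 at [CLS]/[SEP] positions
        shift = special_token_mask.unsqueeze(-1) * self.special_shift
        hidden_states = hidden_states + shift    # adapt special tokens before self-attention
        # the wrapped layer is assumed to accept (hidden_states, attention_mask)
        return self.layer(hidden_states, attention_mask)
```

In a wrapper like this, the trainable parameters scale with the number of layers times the hidden size (plus any task head) rather than with the full model size; a shared per-layer vector is used here for simplicity, and separate vectors per special-token type would be a straightforward variant.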
Anthology ID: 2023.eacl-main.60
Volume: Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
Month: May
Year: 2023
Address: Dubrovnik, Croatia
Editors: Andreas Vlachos, Isabelle Augenstein
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 865–872
URL: https://aclanthology.org/2023.eacl-main.60
DOI: 10.18653/v1/2023.eacl-main.60
Cite (ACL): Xiaocong Yang, James Y. Huang, Wenxuan Zhou, and Muhao Chen. 2023. Parameter-Efficient Tuning with Special Token Adaptation. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 865–872, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal): Parameter-Efficient Tuning with Special Token Adaptation (Yang et al., EACL 2023)
PDF: https://aclanthology.org/2023.eacl-main.60.pdf
Video: https://aclanthology.org/2023.eacl-main.60.mp4