BiKT: Enabling Bidirectional Knowledge Transfer Between Pretrained Models and Sequential Downstream Tasks

Hang Zeng; Chaoyue Niu; Fan Wu (吴凡, 吴钒); Shaojie Tang; Leihao Pei; Chengfei Lv; Guihai Chen

doi:10.18653/v1/2024.findings-emnlp.179

Abstract

Adapting pretrained models to downstream tasks is important in practical applications. Existing frameworks adapt from an initial pretrained model to each downstream task directly, but ignore the sequential nature of the downstream tasks and their feedback effect on the pretrained model. In this work, we propose a new framework, called BiKT, to enable bidirectional knowledge transfer between pretrained models and downstream tasks in rounds. We model each downstream task in the current round as a target task for adaptation and treat all the tasks in the previous rounds as source tasks for feedback. We design a feedback algorithm by multi-task learning over the labeled data of the source tasks, where task-specific prompts are plugged into the backbone network for decoupling task-exclusive knowledge from task-shared knowledge. We further utilize the good initialization of the new backbone network updated in the feedback phase and the trained prompts of the source tasks for adaptation. Evaluation over 9 GLUE datasets, 6 SuperGLUE datasets, and 8 other datasets using models with different pretraining levels and different parameter scales shows remarkable improvement in full-shot and few-shot adaptation settings.
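The round structure described in the abstract (a feedback phase over the source tasks, then adaptation to the target task) can be illustrated with a toy sketch. This is not the authors' implementation: the linear "backbone", the one-dimensional "prompt" prepended to the input, the squared-error loss, the choice to initialise the target prompt as the mean of the source prompts, and every name below (`predict`, `sgd_step`, `feedback`, `adapt`, `mse`) are assumptions made only to keep the example small and runnable.

```python
def predict(w, prompt, x):
    # Toy stand-in for "plugging a prompt into the backbone":
    # the shared weights w see the concatenation [prompt ; x].
    z = prompt + x
    return sum(wi * zi for wi, zi in zip(w, z))


def mse(w, prompt, examples):
    # Mean squared error of one task's examples under (w, prompt).
    return sum((predict(w, prompt, x) - y) ** 2 for x, y in examples) / len(examples)


def sgd_step(w, prompt, x, y, lr):
    # One squared-error SGD step on both the shared weights and the prompt.
    z = prompt + x
    err = predict(w, prompt, x) - y
    new_w = [wi - lr * err * zi for wi, zi in zip(w, z)]
    new_prompt = [p - lr * err * wi for p, wi in zip(prompt, w[:len(prompt)])]
    return new_w, new_prompt


def feedback(w, prompts, source_data, lr=0.05, epochs=200):
    # Feedback phase: multi-task learning over all source tasks.
    # w is shared across tasks; each task keeps its own prompt.
    prompts = dict(prompts)  # do not mutate the caller's dict
    for _ in range(epochs):
        for task, examples in source_data.items():
            for x, y in examples:
                w, prompts[task] = sgd_step(w, prompts[task], x, y, lr)
    return w, prompts


def adapt(w, prompts, target_examples, lr=0.05, epochs=200):
    # Adaptation phase: start from the updated backbone and the source prompts.
    # Assumption: the target prompt is initialised as the mean source prompt.
    k = len(next(iter(prompts.values())))
    prompt = [sum(p[i] for p in prompts.values()) / len(prompts) for i in range(k)]
    for _ in range(epochs):
        for x, y in target_examples:
            w, prompt = sgd_step(w, prompt, x, y, lr)
    return w, prompt


# Hypothetical toy data: all tasks share the slope on x[0] (task-shared
# knowledge) and differ only in their offset (task-exclusive knowledge).
grid = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
source_data = {
    "task_a": [(x, 2.0 * x[0] + 1.0) for x in grid],
    "task_b": [(x, 2.0 * x[0] - 1.0) for x in grid],
}
target_data = [(x, 2.0 * x[0] + 0.5) for x in grid]

w_init = [1.0, 0.0, 0.0]  # [weight on the prompt slot, weights on the two features]
prompts_init = {"task_a": [0.0], "task_b": [0.0]}

w_fb, prompts_fb = feedback(w_init, prompts_init, source_data)  # source tasks -> backbone
w_tgt, prompt_tgt = adapt(w_fb, prompts_fb, target_data)        # backbone -> target task
```

In this toy setting the shared weights can absorb the common slope while each prompt carries its task's offset, which is one simple reading of "decoupling task-exclusive knowledge from task-shared knowledge"; the actual prompt design and training procedure are specified in the paper, not here.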

Anthology ID: 2024.findings-emnlp.179
Volume: Findings of the Association for Computational Linguistics: EMNLP 2024
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3156–3171
URL: https://aclanthology.org/2024.findings-emnlp.179/
DOI: 10.18653/v1/2024.findings-emnlp.179
Cite (ACL): Hang Zeng, Chaoyue Niu, Fan Wu, Shaojie Tang, Leihao Pei, Chengfei Lv, and Guihai Chen. 2024. BiKT: Enabling Bidirectional Knowledge Transfer Between Pretrained Models and Sequential Downstream Tasks. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 3156–3171, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): BiKT: Enabling Bidirectional Knowledge Transfer Between Pretrained Models and Sequential Downstream Tasks (Zeng et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-emnlp.179.pdf
Software: 2024.findings-emnlp.179.software.zip