UEGP: Unified Expert-Guided Pre-training for Knowledge Rekindle

Yutao Mou, Kexiang Wang, Jianhe Lin, Dehong Ma, Jun Fan, Daiting Shi, Zhicong Cheng, Gu Simiu, Dawei Yin, Weiran Xu


Abstract
The pre-training and fine-tuning framework has become the standard training paradigm for NLP tasks and is also widely used in industrial applications. However, this paradigm still has a limitation: simply fine-tuning with task-specific objectives tends to converge to local minima, resulting in sub-optimal performance. In this paper, we first propose a new paradigm, knowledge rekindle, which re-incorporates the fine-tuned expert model into the training cycle and breaks through the performance upper bound of the expert without introducing additional annotated data. We then propose a unified expert-guided pre-training (UEGP) framework for knowledge rekindle. Specifically, we reuse fine-tuned expert models for various downstream tasks as knowledge sources and inject task-specific prior knowledge into pre-trained language models (PLMs) by means of knowledge distillation. In this process, we perform multi-task learning with knowledge distillation and masked language modeling (MLM) objectives. We also explore whether mixture-of-expert guided pre-training (MoEGP) can further enhance the effect of knowledge rekindle. Experiments and analysis on eight datasets from the GLUE benchmark and an industrial search re-ranking dataset show the effectiveness of our method.
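As a rough illustration of the multi-task objective described in the abstract, the sketch below combines a knowledge-distillation loss (the student PLM mimicking a frozen fine-tuned expert) with a standard MLM loss. The temperature, loss weighting, and function names are illustrative assumptions for this sketch, not details taken from the paper.

    import torch
    import torch.nn.functional as F

    def uegp_step(student_logits, expert_logits, mlm_logits, mlm_labels,
                  temperature=2.0, kd_weight=0.5):
        """One expert-guided pre-training step (illustrative sketch).

        Jointly minimizes a distillation loss against a frozen expert and an
        MLM loss; the weighting and temperature are assumed values.
        """
        # Distillation: KL divergence between temperature-softened distributions.
        kd_loss = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=-1),
            F.softmax(expert_logits / temperature, dim=-1),
            reduction="batchmean",
        ) * (temperature ** 2)

        # MLM: cross-entropy over masked positions (label -100 means "not masked").
        mlm_loss = F.cross_entropy(
            mlm_logits.view(-1, mlm_logits.size(-1)),
            mlm_labels.view(-1),
            ignore_index=-100,
        )

        return kd_weight * kd_loss + (1.0 - kd_weight) * mlm_loss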
Anthology ID:
2024.findings-naacl.170
Volume:
Findings of the Association for Computational Linguistics: NAACL 2024
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2661–2673
URL:
https://aclanthology.org/2024.findings-naacl.170
Cite (ACL):
Yutao Mou, Kexiang Wang, Jianhe Lin, Dehong Ma, Jun Fan, Daiting Shi, Zhicong Cheng, Gu Simiu, Dawei Yin, and Weiran Xu. 2024. UEGP: Unified Expert-Guided Pre-training for Knowledge Rekindle. In Findings of the Association for Computational Linguistics: NAACL 2024, pages 2661–2673, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
UEGP: Unified Expert-Guided Pre-training for Knowledge Rekindle (Mou et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-naacl.170.pdf
Copyright:
 2024.findings-naacl.170.copyright.pdf