Improving the Sample Efficiency of Prompt Tuning with Domain Adaptation
Xu Guo | Boyang Li | Han Yu
Findings of the Association for Computational Linguistics: EMNLP 2022
Prompt tuning, or the conditioning of a frozen pretrained language model (PLM) with soft prompts learned from data, has demonstrated impressive performance on a wide range of NLP tasks. However, prompt tuning requires a large training dataset to be effective and is outperformed by finetuning the entire PLM in data-scarce regimes. Previous work (Gu et al., 2022; Vu et al., 2022) proposed to transfer soft prompts pretrained on the source domain to the target domain. In this paper, we explore domain adaptation for prompt tuning, a problem setting where unlabeled data from the target domain are available during pretraining. We propose bOosting Prompt TunIng with doMain Adaptation (OPTIMA), which regularizes the decision boundary to be smooth around regions where source and target data distributions are similar. Extensive experiments demonstrate that OPTIMA significantly enhances the transferability and sample efficiency of prompt tuning compared to strong baselines. Moreover, in few-shot settings, OPTIMA exceeds full-model tuning by a large margin.
基于多源知识融合的领域情感词典表示学习研究(Domain Sentiment Lexicon Representation Learning Based on Multi-source Knowledge Fusion)
Ruihua Qi (祁瑞华) | Jia Wei (魏佳) | Zhen Shao (邵震) | Xu Guo (郭旭) | Heng Chen (陈恒)
Proceedings of the 21st Chinese National Conference on Computational Linguistics
Latent-Optimized Adversarial Neural Transfer for Sarcasm Detection
Xu Guo | Boyang Li | Han Yu | Chunyan Miao
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
The existence of multiple datasets for sarcasm detection prompts us to apply transfer learning to exploit their commonality. The adversarial neural transfer (ANT) framework utilizes multiple loss terms that encourage the source-domain and the target-domain feature distributions to be similar while optimizing for domain-specific performance. However, these objectives may be in conflict, which can lead to optimization difficulties and sometimes diminished transfer. We propose a generalized latent optimization strategy that allows different losses to accommodate each other and improves training dynamics. The proposed method outperforms transfer learning and meta-learning baselines. In particular, we achieve 10.02% absolute performance gain over the previous state of the art on the iSarcasm dataset.