General-to-Specific Transfer Labeling for Domain Adaptable Keyphrase Generation

Rui Meng, Tong Wang, Xingdi Yuan, Yingbo Zhou, Daqing He


Abstract
Training keyphrase generation (KPG) models requires a large amount of annotated data, which can be prohibitively expensive and is often limited to specific domains. In this study, we first demonstrate that large distribution shifts among different domains severely hinder the transferability of KPG models. We then propose a three-stage pipeline that gradually guides KPG models’ learning focus from general syntactical features to domain-related semantics in a data-efficient manner. In the first stage, domain-general phrase pre-training, we pre-train Sequence-to-Sequence models with generic phrase annotations that are widely available on the web, which enables the models to generate phrases in a wide range of domains. The resulting model is then applied in the Transfer Labeling stage to produce domain-specific pseudo keyphrases, which help adapt models to a new domain. Finally, we fine-tune the model on a limited amount of data with true labels to fully adapt it to the target domain. Our experimental results show that the proposed process can produce good-quality keyphrases in new domains and achieve consistent improvements after adaptation with limited in-domain annotated data. All code and datasets are available at https://github.com/memray/OpenNMT-kpg-release.
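The control flow of the three stages described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: `ToyKPGModel`, its `train` and `predict` methods, and `run_pipeline` are hypothetical names, and the "model" merely records which stage it was trained in rather than learning anything.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# (document text, target phrases); a hypothetical data layout for illustration.
Example = Tuple[str, List[str]]


@dataclass
class ToyKPGModel:
    """Hypothetical stand-in for a Sequence-to-Sequence KPG model."""
    history: List[str] = field(default_factory=list)

    def train(self, stage: str, data: List[Example]) -> None:
        # A real model would run gradient updates here; this stub only logs
        # the stage name and the number of training examples.
        self.history.append(f"{stage}:{len(data)}")

    def predict(self, text: str) -> List[str]:
        # Placeholder prediction: treat capitalized tokens as "phrases".
        return [tok for tok in text.split() if tok[:1].isupper()]


def run_pipeline(
    model: ToyKPGModel,
    general_data: List[Example],
    unlabeled_target_docs: List[str],
    labeled_target_data: List[Example],
) -> ToyKPGModel:
    # Stage 1: domain-general phrase pre-training on generic phrase annotations.
    model.train("pretrain", general_data)

    # Stage 2: transfer labeling. The pre-trained model labels unlabeled
    # target-domain documents, and those pseudo keyphrases become training data.
    pseudo_labeled = [(doc, model.predict(doc)) for doc in unlabeled_target_docs]
    model.train("transfer_labeling", pseudo_labeled)

    # Stage 3: fine-tuning on the limited set of true in-domain labels.
    model.train("finetune", labeled_target_data)
    return model


general = [("Paris is in France", ["Paris", "France"]), ("Go is a language", ["Go"])]
unlabeled = ["Neural Networks for Retrieval", "Graph Methods", "Topic Models"]
labeled = [("Keyphrase Generation with Transformers", ["Keyphrase Generation"])]

model = run_pipeline(ToyKPGModel(), general, unlabeled, labeled)
```

The ordering is the point of the sketch: the pseudo labels in stage 2 come from the model produced by stage 1, and the scarce true labels are used only in the final stage.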
Anthology ID:
2023.findings-acl.102
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1602–1618
URL:
https://aclanthology.org/2023.findings-acl.102
DOI:
10.18653/v1/2023.findings-acl.102
Cite (ACL):
Rui Meng, Tong Wang, Xingdi Yuan, Yingbo Zhou, and Daqing He. 2023. General-to-Specific Transfer Labeling for Domain Adaptable Keyphrase Generation. In Findings of the Association for Computational Linguistics: ACL 2023, pages 1602–1618, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
General-to-Specific Transfer Labeling for Domain Adaptable Keyphrase Generation (Meng et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-acl.102.pdf