KPDROP: Improving Absent Keyphrase Generation

Jishnu Ray Chowdhury, Seo Yeon Park, Tuhin Kundu, Cornelia Caragea


Abstract
Keyphrase generation is the task of generating phrases (keyphrases) that summarize the main topics of a given document. Keyphrases can be either present in or absent from the text of the document. While the extraction of present keyphrases has received much attention in the past, only recently has a stronger focus been placed on the generation of absent keyphrases. However, generating absent keyphrases is challenging; even the best methods show only a modest degree of success. In this paper, we propose a model-agnostic approach called keyphrase dropout (or KPDrop) to improve absent keyphrase generation. In this approach, we randomly drop present keyphrases from the document and turn them into artificial absent keyphrases during training. We test our approach extensively and show that it consistently improves the absent keyphrase performance of strong baselines in both supervised and resource-constrained semi-supervised settings.
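The dropout step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `is_present` and `kp_drop`, the `drop_rate` default, the `[MASK]` placeholder, and the case-insensitive whole-word matching are all assumptions made for illustration, since the abstract only states that present keyphrases are randomly dropped from the document and treated as absent during training.

```python
import random
import re


def _pattern(keyphrase: str) -> re.Pattern:
    # Assumption: a keyphrase is "present" if it occurs as a whole-word,
    # case-insensitive match. Real systems often compare stemmed tokens instead.
    return re.compile(r"\b" + re.escape(keyphrase) + r"\b", flags=re.IGNORECASE)


def is_present(keyphrase: str, document: str) -> bool:
    """Return True if the keyphrase occurs in the document text."""
    return _pattern(keyphrase).search(document) is not None


def kp_drop(document, keyphrases, drop_rate=0.7, mask="[MASK]", rng=None):
    """Randomly drop present keyphrases from the document (hypothetical sketch).

    Each present keyphrase is dropped with probability `drop_rate`
    (an illustrative default, not a value taken from the paper).
    Every occurrence of a dropped keyphrase is replaced by `mask`.
    Returns the modified document and the present/absent split
    recomputed against the modified text.
    """
    rng = rng or random.Random()
    for kp in keyphrases:
        if is_present(kp, document) and rng.random() < drop_rate:
            # Replace via a function so `mask` is inserted literally.
            document = _pattern(kp).sub(lambda _match: mask, document)
    # Recompute the split at the end: dropping one keyphrase can also
    # remove an overlapping keyphrase from the text.
    present = [kp for kp in keyphrases if is_present(kp, document)]
    absent = [kp for kp in keyphrases if not is_present(kp, document)]
    return document, present, absent
```

Under these assumptions, the function would be applied to each training example before it is passed to any keyphrase generation model, which is what makes the idea model-agnostic: only the input text and the present/absent labels change. Passing a seeded `random.Random` as `rng` makes the dropping reproducible.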
Anthology ID:
2022.findings-emnlp.357
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2022
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
4853–4870
URL:
https://aclanthology.org/2022.findings-emnlp.357
DOI:
10.18653/v1/2022.findings-emnlp.357
Cite (ACL):
Jishnu Ray Chowdhury, Seo Yeon Park, Tuhin Kundu, and Cornelia Caragea. 2022. KPDROP: Improving Absent Keyphrase Generation. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 4853–4870, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
KPDROP: Improving Absent Keyphrase Generation (Ray Chowdhury et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-emnlp.357.pdf