Improving Absent Keyphrase Generation with Diversity Heads

Edwin Thomas, Sowmya Vajjala


Abstract
Keyphrase Generation (KPG) is the task of automatically generating appropriate keyphrases for a given text, with a wide range of real-world applications such as document indexing and tagging, information retrieval, and text summarization. During evaluation, NLP research distinguishes between present and absent keyphrases based on whether a keyphrase appears directly as a sequence of words in the document. During training, however, present and absent keyphrases are treated together in a text-to-text generation framework. In this paper, we treat present keyphrase extraction as a sequence labeling problem and propose a new absent keyphrase generation model that uses a modified cross-attention layer with additional heads to capture diverse views of the same context encoding. Our experiments show improvements over the state of the art on four datasets for present keyphrase extraction and on five datasets for absent keyphrase generation, among the six English datasets we explored, covering long and short documents.
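The present/absent distinction described in the abstract can be illustrated with a small sketch. It is not the authors' code: the function names (`tokenize`, `is_present`, `split_keyphrases`) and the simple lowercase, whitespace-based matching are assumptions made for illustration. Evaluation setups in the literature often apply stemming before matching, which this sketch omits.

```python
# Illustrative sketch only: names and matching rules are assumptions,
# not taken from the paper.
PUNCT = ".,;:!?()\"'"


def tokenize(text):
    """Lowercase, split on whitespace, and strip punctuation from token edges."""
    tokens = (t.strip(PUNCT) for t in text.lower().split())
    return [t for t in tokens if t]


def is_present(keyphrase, doc_tokens):
    """Return True if the keyphrase occurs as a contiguous token sequence."""
    kp = tokenize(keyphrase)
    n = len(kp)
    if n == 0:
        return False
    return any(doc_tokens[i:i + n] == kp
               for i in range(len(doc_tokens) - n + 1))


def split_keyphrases(document, keyphrases):
    """Partition keyphrases into (present, absent) with respect to the document."""
    doc_tokens = tokenize(document)
    present = [k for k in keyphrases if is_present(k, doc_tokens)]
    absent = [k for k in keyphrases if not is_present(k, doc_tokens)]
    return present, absent
```

Under this definition, a keyphrase whose words all occur in the document but not contiguously (for example "neural generation" in a text containing "neural model for keyphrase generation") still counts as absent.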
Anthology ID:
2024.findings-naacl.102
Volume:
Findings of the Association for Computational Linguistics: NAACL 2024
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1568–1584
URL:
https://aclanthology.org/2024.findings-naacl.102
Cite (ACL):
Edwin Thomas and Sowmya Vajjala. 2024. Improving Absent Keyphrase Generation with Diversity Heads. In Findings of the Association for Computational Linguistics: NAACL 2024, pages 1568–1584, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Improving Absent Keyphrase Generation with Diversity Heads (Thomas & Vajjala, Findings 2024)
PDF:
https://aclanthology.org/2024.findings-naacl.102.pdf
Copyright:
 2024.findings-naacl.102.copyright.pdf