Word centrality constrained representation for keyphrase extraction

Zelalem Gero, Joyce Ho


Abstract
To keep pace with the increased generation and digitization of documents, automated methods that can improve search, discovery and mining of the vast body of literature are essential. Keyphrases provide a concise representation by identifying salient concepts in a document. Various supervised approaches model keyphrase extraction using local context to predict the label for each token and perform much better than their unsupervised counterparts. Unfortunately, these approaches fail for short documents where the context is unclear. Moreover, keyphrases, which usually capture the gist of a document, need to reflect its central theme. We propose a new extraction model that introduces a centrality constraint to enrich the word representation of a bidirectional long short-term memory (BiLSTM) network. Performance evaluation on two publicly available datasets demonstrates that our model outperforms existing state-of-the-art approaches.
Anthology ID:
2021.bionlp-1.17
Volume:
Proceedings of the 20th Workshop on Biomedical Language Processing
Month:
June
Year:
2021
Address:
Online
Venues:
BioNLP | NAACL
SIG:
SIGBIOMED
Publisher:
Association for Computational Linguistics
Pages:
155–161
URL:
https://aclanthology.org/2021.bionlp-1.17
DOI:
10.18653/v1/2021.bionlp-1.17
Cite (ACL):
Zelalem Gero and Joyce Ho. 2021. Word centrality constrained representation for keyphrase extraction. In Proceedings of the 20th Workshop on Biomedical Language Processing, pages 155–161, Online. Association for Computational Linguistics.
Cite (Informal):
Word centrality constrained representation for keyphrase extraction (Gero & Ho, BioNLP 2021)
PDF:
https://aclanthology.org/2021.bionlp-1.17.pdf
Code:
zhgero/keyphrases_centrality