Building Content-driven Entity Networks for Scarce Scientific Literature using Content Information

Reinald Kim Amplayo, Min Song


Abstract
This paper proposes several network construction methods for collections of scarce scientific literature data. We define scarcity as lacking in value and in volume. Instead of using the papers' metadata to construct several kinds of scientific networks, we use the full texts of the articles and automatically extract the entities needed to construct the networks. Specifically, we present seven kinds of networks using the proposed construction methods: co-occurrence networks for author, keyword, and biological entities, and citation networks for author, keyword, biological, and topic entities. We show two case studies that apply our proposed methods: CADASIL, a rare disease that is nonetheless the most common form of hereditary stroke disorder, and Metformin, the first-line medication for the treatment of type 2 diabetes. We evaluate our proposed methods on four applications: finding prolific authors, finding important bio-entities, finding meaningful keywords, and discovering influential topics. The results show that the co-occurrence and citation networks constructed using the proposed methods outperform networks built with traditional methods. We also compare our proposed networks to traditional citation networks constructed from sufficient data and infer that, even given the same sufficient amount of data, our methods perform comparably to or better than the traditional methods.
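As a rough illustration of the co-occurrence idea described in the abstract, the sketch below counts how often pairs of entities appear in the same document. It is not the authors' implementation: the function name `build_cooccurrence_network`, the per-document co-occurrence window, and the toy entity lists are all assumptions made for this example, and the entity extraction step itself is not shown.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_network(doc_entities):
    """Return edge weights: how many documents mention both entities of a pair.

    Hypothetical helper for illustration; the paper's own construction
    method may define co-occurrence differently.
    """
    edge_weights = Counter()
    for entities in doc_entities:
        # Deduplicate within a document and sort so that each
        # unordered pair is always stored in one canonical order.
        unique = sorted(set(entities))
        for a, b in combinations(unique, 2):
            edge_weights[(a, b)] += 1
    return edge_weights


# Toy, hand-written entity lists standing in for the output of an
# automatic extraction step (assumed, not taken from the paper's data).
docs = [
    ["CADASIL", "stroke"],
    ["CADASIL", "stroke", "Metformin"],
    ["Metformin", "type 2 diabetes"],
]

network = build_cooccurrence_network(docs)
for (a, b), weight in sorted(network.items()):
    print(f"{a} -- {b}: {weight}")
```

The same pattern would apply to author or keyword entities by swapping in different entity lists. A citation network would instead need directed edges from the citing document's entities to the cited document's entities.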
Anthology ID:
W16-5103
Volume:
Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM2016)
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Sophia Ananiadou, Riza Batista-Navarro, Kevin Bretonnel Cohen, Dina Demner-Fushman, Paul Thompson
Venue:
WS
Publisher:
The COLING 2016 Organizing Committee
Pages:
20–29
URL:
https://aclanthology.org/W16-5103
Cite (ACL):
Reinald Kim Amplayo and Min Song. 2016. Building Content-driven Entity Networks for Scarce Scientific Literature using Content Information. In Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM2016), pages 20–29, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Building Content-driven Entity Networks for Scarce Scientific Literature using Content Information (Amplayo & Song, 2016)
PDF:
https://aclanthology.org/W16-5103.pdf