COVID-19 Literature Topic-Based Search via Hierarchical NMF

Rachel Grotheer, Longxiu Huang, Yihuan Huang, Alona Kryshchenko, Oleksandr Kryshchenko, Pengyu Li, Xia Li, Elizaveta Rebrova, Kyung Ha, Deanna Needell


Abstract
A dataset of COVID-19-related scientific literature is compiled, combining the articles from several online libraries and selecting those with open access and full text available. Then, hierarchical nonnegative matrix factorization is used to organize literature related to the novel coronavirus into a tree structure that allows researchers to search for relevant literature based on detected topics. We discover eight major latent topics and 52 granular subtopics in the body of literature, related to vaccines, genetic structure and modeling of the disease and patient studies, as well as related diseases and virology. To make this tool useful to current researchers, an interactive website is created that organizes the available literature using this hierarchical structure.
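To make the idea of organizing documents into a topic tree concrete, here is a minimal sketch of one common way to apply NMF hierarchically: factor a nonnegative document-term matrix, assign each document to its strongest topic, and then factor each topic's documents again. This is an illustration under stated assumptions, not the authors' implementation. The function names `nmf` and `hierarchical_nmf`, the multiplicative-update solver, the hard argmax assignment, the toy vocabulary, and the per-level topic counts in `ks` are all hypothetical choices made for this example.

```python
import numpy as np

def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate nonnegative X (docs x terms) as W @ H using
    Lee-Seung multiplicative updates for the Frobenius objective."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps   # document-topic weights
    H = rng.random((k, m)) + eps   # topic-term weights
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def hierarchical_nmf(X, doc_ids, ks, vocab, top_n=3):
    """Return a list of topic nodes. Each node holds its top keywords,
    the documents assigned to it, and (optionally) child topics."""
    k, rest = ks[0], ks[1:]
    W, H = nmf(X, k)
    labels = W.argmax(axis=1)      # assumption: hard assignment to strongest topic
    nodes = []
    for t in range(k):
        mask = labels == t
        node = {
            "keywords": [vocab[j] for j in np.argsort(H[t])[::-1][:top_n]],
            "docs": [doc_ids[i] for i in np.flatnonzero(mask)],
            "children": [],
        }
        # Recurse only if deeper levels were requested and the topic
        # has more documents than the number of subtopics to extract.
        if rest and mask.sum() > rest[0]:
            node["children"] = hierarchical_nmf(
                X[mask], node["docs"], rest, vocab, top_n
            )
        nodes.append(node)
    return nodes

# Toy document-term counts (hypothetical data, six documents, six terms).
vocab = ["vaccine", "antibody", "genome", "sequence", "patient", "symptom"]
X = np.array([
    [3, 2, 0, 0, 0, 0],
    [2, 3, 0, 0, 0, 0],
    [0, 0, 3, 2, 0, 0],
    [0, 0, 2, 3, 0, 0],
    [0, 0, 0, 0, 3, 2],
    [0, 0, 0, 0, 2, 3],
], dtype=float)

# Two topics at the top level, then up to two subtopics per topic.
tree = hierarchical_nmf(X, list(range(6)), ks=[2, 2], vocab=vocab)
for node in tree:
    print(node["keywords"], node["docs"], len(node["children"]))
```

In a search tool of this kind, a reader would browse the top-level keywords first and then descend into a node's `children` to narrow the list of documents. On a real corpus, the input would typically be a TF-IDF matrix rather than raw counts, and the entries of `ks` would be chosen to match the desired number of topics at each level.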
Anthology ID:
2020.nlpcovid19-2.4
Volume:
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020
Month:
December
Year:
2020
Address:
Online
Venue:
NLP-COVID19
Publisher:
Association for Computational Linguistics
URL:
https://aclanthology.org/2020.nlpcovid19-2.4
DOI:
10.18653/v1/2020.nlpcovid19-2.4
Cite (ACL):
Rachel Grotheer, Longxiu Huang, Yihuan Huang, Alona Kryshchenko, Oleksandr Kryshchenko, Pengyu Li, Xia Li, Elizaveta Rebrova, Kyung Ha, and Deanna Needell. 2020. COVID-19 Literature Topic-Based Search via Hierarchical NMF. In Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020, Online. Association for Computational Linguistics.
Cite (Informal):
COVID-19 Literature Topic-Based Search via Hierarchical NMF (Grotheer et al., NLP-COVID19 2020)
PDF:
https://aclanthology.org/2020.nlpcovid19-2.4.pdf
Video:
https://slideslive.com/38939856