Ernesto L. Estevanell-Valladares


2021

Knowledge Discovery in COVID-19 Research Literature
Ernesto L. Estevanell-Valladares | Suilan Estevez-Velarde | Alejandro Piad-Morffis | Yoan Gutierrez | Andres Montoyo | Rafael Muñoz | Yudivián Almeida Cruz
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

This paper presents the preliminary results of an ongoing project that analyzes the growing body of scientific research published around the COVID-19 pandemic. In this research, a general-purpose semantic model is used to double-annotate a batch of 500 sentences that were manually selected from the CORD-19 corpus. Afterwards, a baseline text-mining pipeline is designed and evaluated on a large batch of 100,959 sentences. We present a qualitative analysis of the most interesting automatically extracted facts and highlight possible future lines of development. The preliminary results show that general-purpose semantic models are a useful tool for discovering fine-grained knowledge in large corpora of scientific documents.