CORA: A Deep Active Learning Covid-19 Relevancy Algorithm to Identify Core Scientific Articles

Zubair Afzal, Vikrant Yadav, Olga Fedorova, Vaishnavi Kandala, Janneke van de Loo, Saber A. Akhondi, Pascal Coupet, George Tsatsaronis


Abstract
Ever since the COVID-19 pandemic broke out, the academic and scientific research community, as well as industry and governments around the world, have joined forces in an unprecedented manner to fight the threat. Clinicians, biologists, chemists, bioinformaticians, nurses, data scientists, and members of all related disciplines have been mobilized to help discover efficient treatments for the infected population, as well as a vaccine to prevent further spread of the virus. In this fight against the virus responsible for the pandemic, the timely, accurate, peer-reviewed, and efficient communication of novel research findings is key to any advancement. In this paper we present a novel framework to address the information need of efficiently filtering the scientific bibliography for literature relevant to COVID-19. The contributions of the paper are as follows: we define and describe the information need that encompasses the major requirements for COVID-19 article relevancy, we present and release an expert-curated benchmark set for the task, and we analyze the performance of several state-of-the-art machine learning classifiers that may distinguish relevant from non-relevant COVID-19 literature.
Anthology ID:
2020.nlpcovid19-2.2
Volume:
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020
Month:
December
Year:
2020
Address:
Online
Editors:
Karin Verspoor, Kevin Bretonnel Cohen, Michael Conway, Berry de Bruijn, Mark Dredze, Rada Mihalcea, Byron Wallace
Venue:
NLP-COVID19
Publisher:
Association for Computational Linguistics
URL:
https://aclanthology.org/2020.nlpcovid19-2.2
DOI:
10.18653/v1/2020.nlpcovid19-2.2
Cite (ACL):
Zubair Afzal, Vikrant Yadav, Olga Fedorova, Vaishnavi Kandala, Janneke van de Loo, Saber A. Akhondi, Pascal Coupet, and George Tsatsaronis. 2020. CORA: A Deep Active Learning Covid-19 Relevancy Algorithm to Identify Core Scientific Articles. In Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020, Online. Association for Computational Linguistics.
Cite (Informal):
CORA: A Deep Active Learning Covid-19 Relevancy Algorithm to Identify Core Scientific Articles (Afzal et al., NLP-COVID19 2020)
PDF:
https://aclanthology.org/2020.nlpcovid19-2.2.pdf
Video:
https://slideslive.com/38939846