Improved Topic Representations of Medical Documents to Assist COVID-19 Literature Exploration

Yulia Otmakhova, Karin Verspoor, Timothy Baldwin, Simon Šuster


Abstract
Efficient discovery and exploration of biomedical literature has grown in importance in the context of the COVID-19 pandemic, and topic-based methods such as latent Dirichlet allocation (LDA) are a useful tool for this purpose. In this study we compare traditional topic models based on word tokens with topic models based on medical concepts, and propose several ways to improve topic coherence and specificity.
Anthology ID:
2020.nlpcovid19-2.12
Volume:
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020
Month:
December
Year:
2020
Address:
Online
Editors:
Karin Verspoor, Kevin Bretonnel Cohen, Michael Conway, Berry de Bruijn, Mark Dredze, Rada Mihalcea, Byron Wallace
Venue:
NLP-COVID19
Publisher:
Association for Computational Linguistics
URL:
https://aclanthology.org/2020.nlpcovid19-2.12
DOI:
10.18653/v1/2020.nlpcovid19-2.12
Cite (ACL):
Yulia Otmakhova, Karin Verspoor, Timothy Baldwin, and Simon Šuster. 2020. Improved Topic Representations of Medical Documents to Assist COVID-19 Literature Exploration. In Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020, Online. Association for Computational Linguistics.
Cite (Informal):
Improved Topic Representations of Medical Documents to Assist COVID-19 Literature Exploration (Otmakhova et al., NLP-COVID19 2020)
PDF:
https://aclanthology.org/2020.nlpcovid19-2.12.pdf
Video:
https://slideslive.com/38939857
Data:
CORD-19