Ramya Tekumalla
2020
Characterizing drug mentions in COVID-19 Twitter Chatter
Ramya Tekumalla | Juan M Banda
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020
Since COVID-19 was classified as a global pandemic, there have been many attempts to treat and contain the virus. Although no specific antiviral treatment is recommended for COVID-19, several drugs can potentially help with symptoms. In this work, we mined a large Twitter dataset of 424 million tweets of COVID-19 chatter to identify discourse around drug mentions. Although the task is seemingly straightforward, the informal nature of language use on Twitter leads us to demonstrate the need for machine learning alongside traditional automated methods. By applying these complementary methods, we recover almost 15% additional data, which makes misspelling handling a necessary pre-processing step when dealing with social media data.
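To illustrate why misspelling handling matters for drug-mention mining, the following is a minimal sketch that pairs exact dictionary lookup with approximate string matching. It is not the method used in this work: the abstract refers to machine learning, whereas this sketch substitutes the Python standard library's `difflib.get_close_matches` as a simple stand-in. The lexicon `DRUG_TERMS`, the function `find_drug_mentions`, and the `cutoff` value are all hypothetical and chosen only for illustration.

```python
import difflib
import re

# Hypothetical drug lexicon for illustration only; not the dictionary used in the paper.
DRUG_TERMS = ["hydroxychloroquine", "remdesivir", "azithromycin"]


def find_drug_mentions(tweet, terms=DRUG_TERMS, cutoff=0.85):
    """Return (token, matched_term, is_exact) for each drug-like token in a tweet.

    Exact dictionary hits are reported first-class; tokens that miss the
    dictionary are compared against it with difflib's similarity ratio
    (an assumed stand-in for a learned misspelling model).
    """
    mentions = []
    for token in re.findall(r"[a-z]+", tweet.lower()):
        if token in terms:
            # Traditional automated step: exact lexicon lookup.
            mentions.append((token, token, True))
            continue
        # Fallback step: nearest lexicon entry above the similarity cutoff.
        close = difflib.get_close_matches(token, terms, n=1, cutoff=cutoff)
        if close:
            mentions.append((token, close[0], False))
    return mentions
```

Calling `find_drug_mentions` on a tweet that contains both a correctly spelled and a misspelled drug name returns one tuple per mention, with the final flag distinguishing exact hits from recovered misspellings. The cutoff trades recall against false positives and would need tuning on real data.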