Two Stage Transformer Model for COVID-19 Fake News Detection and Fact Checking

Rutvik Vijjali, Prathyush Potluri, Siddharth Kumar, Sundeep Teki


Abstract
The rapid advancement of technology in online communication via social media platforms has led to a prolific rise in the spread of misinformation and fake news. Fake news is especially rampant in the current COVID-19 pandemic, leading people to believe false and potentially harmful claims and stories. Detecting fake news quickly can limit the spread of panic, chaos and potential health hazards. We developed a two-stage automated pipeline for COVID-19 fake news detection using state-of-the-art machine learning models for natural language processing. The first model leverages a novel fact checking algorithm that retrieves the most relevant facts concerning user queries about particular COVID-19 claims. The second model verifies the level of “truth” in the queried claim by computing the textual entailment between the claim and the true facts retrieved from a manually curated COVID-19 dataset. The dataset is based on a publicly available knowledge source consisting of more than 5000 COVID-19 false claims and verified explanations, a subset of which was internally annotated and cross-validated to train and evaluate our models. We evaluate a series of models, ranging from those based on classical text-based features to more contextual Transformer-based models, and observe that a model pipeline based on BERT and ALBERT for the two stages respectively yields the best results.
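The retrieve-then-verify control flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: bag-of-words cosine similarity stands in for the BERT-based retrieval stage, and a token-overlap score is a placeholder for the ALBERT-based entailment stage. All function names and the two example facts are hypothetical and are not taken from the paper's dataset.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Tuple

# Illustrative facts only (assumption: NOT drawn from the paper's dataset).
FACTS = [
    "5G mobile networks do not spread COVID-19.",
    "Drinking hot water does not cure COVID-19.",
]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(claim: str, facts: List[str], k: int = 1) -> List[Tuple[float, str]]:
    """Stage 1 stand-in: return the k facts most similar to the claim."""
    query = Counter(tokenize(claim))
    scored = [(cosine(query, Counter(tokenize(f))), f) for f in facts]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


def overlap_entailment(fact: str, claim: str) -> float:
    """Stage 2 placeholder: fraction of claim tokens that occur in the fact.

    This is NOT a real entailment model; it ignores negation and word order.
    """
    claim_tokens = set(tokenize(claim))
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & set(tokenize(fact))) / len(claim_tokens)


def check_claim(
    claim: str,
    facts: List[str],
    entail: Callable[[str, str], float] = overlap_entailment,
    k: int = 1,
) -> List[Tuple[str, float, float]]:
    """Run both stages; return (fact, retrieval score, entailment score)."""
    return [(fact, sim, entail(fact, claim)) for sim, fact in retrieve(claim, facts, k)]


if __name__ == "__main__":
    for fact, sim, score in check_claim("5G networks spread COVID-19", FACTS):
        print(f"{fact!r}: retrieval={sim:.2f}, entailment={score:.2f}")
```

In this toy run the retrieval stand-in picks the 5G fact, but the overlap placeholder scores the false claim as fully supported because it cannot see the negation in the retrieved fact. That gap is what a trained textual entailment model in the second stage is meant to close; any scorer with the same `(fact, claim) -> float` signature can be passed as `entail`.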
Anthology ID:
2020.nlp4if-1.1
Volume:
Proceedings of the 3rd NLP4IF Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Giovanni Da San Martino, Chris Brew, Giovanni Luca Ciampaglia, Anna Feldman, Chris Leberknight, Preslav Nakov
Venue:
NLP4IF
Publisher:
International Committee on Computational Linguistics (ICCL)
Pages:
1–10
URL:
https://aclanthology.org/2020.nlp4if-1.1
Cite (ACL):
Rutvik Vijjali, Prathyush Potluri, Siddharth Kumar, and Sundeep Teki. 2020. Two Stage Transformer Model for COVID-19 Fake News Detection and Fact Checking. In Proceedings of the 3rd NLP4IF Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda, pages 1–10, Barcelona, Spain (Online). International Committee on Computational Linguistics (ICCL).
Cite (Informal):
Two Stage Transformer Model for COVID-19 Fake News Detection and Fact Checking (Vijjali et al., NLP4IF 2020)
PDF:
https://aclanthology.org/2020.nlp4if-1.1.pdf
Code:
rutvikvijjali/COVID-19-Claims-Dataset