COVID-19 and Misinformation: A Large-Scale Lexical Analysis on Twitter

Dimosthenis Antypas; Jose Camacho-Collados; Alun Preece; David Rogers

doi:10.18653/v1/2021.acl-srw.13

Abstract

Social media is often used by individuals and organisations as a platform to spread misinformation. With the recent coronavirus pandemic we have seen a surge of misinformation on Twitter, posing a danger to public health. In this paper, we compile a large COVID-19 Twitter misinformation corpus and analyse it to discover patterns in vocabulary usage. Among other findings, our analysis reveals that tweets related to misinformation cover a considerably more limited variety of topics and vocabulary, and are more negative, than randomly extracted tweets. In addition to our qualitative analysis, our experimental results show that a simple linear model based only on lexical features is effective in identifying misinformation-related tweets (with accuracy over 80%), providing evidence that the vocabulary used in misinformation differs substantially from that of generic tweets.

Anthology ID: 2021.acl-srw.13
Volume: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop
Month: August
Year: 2021
Address: Online
Editors: Jad Kabbara, Haitao Lin, Amandalynne Paullada, Jannis Vamvas
Venues: ACL | IJCNLP
Publisher: Association for Computational Linguistics
Pages: 119–126
URL: https://aclanthology.org/2021.acl-srw.13/
DOI: 10.18653/v1/2021.acl-srw.13
Cite (ACL): Dimosthenis Antypas, Jose Camacho-Collados, Alun Preece, and David Rogers. 2021. COVID-19 and Misinformation: A Large-Scale Lexical Analysis on Twitter. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop, pages 119–126, Online. Association for Computational Linguistics.
Cite (Informal): COVID-19 and Misinformation: A Large-Scale Lexical Analysis on Twitter (Antypas et al., ACL-IJCNLP 2021)
PDF: https://aclanthology.org/2021.acl-srw.13.pdf
Video: https://aclanthology.org/2021.acl-srw.13.mp4
