CMTA: COVID-19 Misinformation Multilingual Analysis on Twitter

Raj Pranesh, Mehrdad Farokhenajd, Ambesh Shekhar, Genoveva Vargas-Solar


Abstract
The internet has become an essential source of health knowledge for people around the world during the coronavirus disease (COVID-19) pandemic. During pandemics, myths, sensationalism, rumours, and misinformation, generated intentionally or unintentionally, spread rapidly through social networks. Twitter is one of the popular social networks people use to share COVID-19 related news, information, and thoughts that reflect their perceptions and opinions about the pandemic. Analyzing tweets to identify misinformation can yield useful insight into the quality and readability of online information about COVID-19. This paper presents a multilingual COVID-19 related tweet analysis method, CMTA, that uses BERT, a deep learning model, for multilingual tweet misinformation detection and classification. CMTA extracts features from multilingual textual data, which is then categorized into specific information classes. Classification is done by a Dense-CNN model trained on tweets manually annotated into information classes (i.e., ‘false’, ‘partly false’, ‘misleading’). The paper presents an analysis of multilingual tweets from February to June, showing the distribution of the types of information spread across different languages. To assess the performance of the CMTA multilingual model, we performed a comparative analysis of 8 monolingual models and CMTA on the misinformation detection task. The results show that the proposed CMTA model surpassed various monolingual models, which supports the claim that a multilingual framework can be developed through transfer learning.
Anthology ID:
2021.acl-srw.28
Volume:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
270–283
URL:
https://aclanthology.org/2021.acl-srw.28
DOI:
10.18653/v1/2021.acl-srw.28
Cite (ACL):
Raj Pranesh, Mehrdad Farokhenajd, Ambesh Shekhar, and Genoveva Vargas-Solar. 2021. CMTA: COVID-19 Misinformation Multilingual Analysis on Twitter. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop, pages 270–283, Online. Association for Computational Linguistics.
Cite (Informal):
CMTA: COVID-19 Misinformation Multilingual Analysis on Twitter (Pranesh et al., ACL 2021)
PDF:
https://aclanthology.org/2021.acl-srw.28.pdf
Video:
https://aclanthology.org/2021.acl-srw.28.mp4