Evons: A Dataset for Fake and Real News Virality Analysis and Prediction

Kriste Krstovski, Angela Soomin Ryu, Bruce Kogut


Abstract
We present a novel collection of news articles originating from fake and real news media sources for the analysis and prediction of news virality. Unlike existing fake news datasets, which contain either claims or article headlines and bodies, this collection pairs each article with a Facebook engagement count, which we treat as an indicator of the article's virality. We also provide the article description and thumbnail image with which the article was shared on Facebook. These images were automatically annotated with object tags and color attributes. Using cloud-based vision analysis tools, thumbnail images were also analyzed for faces, and detected faces were annotated with facial attributes. We empirically investigate the use of this collection on an example task of article virality prediction.
Anthology ID:
2022.coling-1.317
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
3589–3596
URL:
https://aclanthology.org/2022.coling-1.317
Cite (ACL):
Kriste Krstovski, Angela Soomin Ryu, and Bruce Kogut. 2022. Evons: A Dataset for Fake and Real News Virality Analysis and Prediction. In Proceedings of the 29th International Conference on Computational Linguistics, pages 3589–3596, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Evons: A Dataset for Fake and Real News Virality Analysis and Prediction (Krstovski et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.317.pdf
Code
 krstovski/evons
Data
RealNews