BREAKING! Presenting Fake News Corpus for Automated Fact Checking

Archita Pathak, Rohini Srihari


Abstract
Popular fake news articles spread faster than mainstream articles on the same topic, which renders manual fact checking inefficient. At the same time, creating tools for automatic detection is equally challenging because of the lack of datasets containing articles that present fake or manipulated stories as compelling facts. In this paper, we introduce a manually verified corpus of compelling fake and questionable news articles on US politics, containing around 700 articles from August to November 2016. We present various analyses of this corpus and then implement a classification model based on linguistic features. This work is still in progress: we plan to extend the dataset in the future and use it in our approach to automated fake news detection.
Anthology ID:
P19-2050
Original:
P19-2050v1
Version 2:
P19-2050v2
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Fernando Alva-Manchego, Eunsol Choi, Daniel Khashabi
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
357–362
URL:
https://aclanthology.org/P19-2050
DOI:
10.18653/v1/P19-2050
Cite (ACL):
Archita Pathak and Rohini Srihari. 2019. BREAKING! Presenting Fake News Corpus for Automated Fact Checking. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, pages 357–362, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
BREAKING! Presenting Fake News Corpus for Automated Fact Checking (Pathak & Srihari, ACL 2019)
PDF:
https://aclanthology.org/P19-2050.pdf