Kriste Krstovski
2022
Evons: A Dataset for Fake and Real News Virality Analysis and Prediction
Kriste Krstovski | Angela Soomin Ryu | Bruce Kogut
Proceedings of the 29th International Conference on Computational Linguistics
We present a novel collection of news articles originating from fake and real news media sources for the analysis and prediction of news virality. Unlike existing fake news datasets, which contain either claims or news article headlines and bodies, in this collection each article is accompanied by a Facebook engagement count, which we consider an indicator of the article's virality. In addition, we provide the article description and thumbnail image with which the article was shared on Facebook. These images were automatically annotated with object tags and color attributes. Using cloud-based vision analysis tools, thumbnail images were also analyzed for faces, and detected faces were annotated with facial attributes. We empirically investigate the use of this collection on an example task of article virality prediction.
2020
DeSePtion: Dual Sequence Prediction and Adversarial Examples for Improved Fact-Checking
Christopher Hidey | Tuhin Chakrabarty | Tariq Alhindi | Siddharth Varia | Kriste Krstovski | Mona Diab | Smaranda Muresan
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
The increased focus on misinformation has spurred development of data and systems for detecting the veracity of a claim as well as retrieving authoritative evidence. The Fact Extraction and VERification (FEVER) dataset provides such a resource for evaluating end-to-end fact-checking, requiring retrieval of evidence from Wikipedia to validate a veracity prediction. We show that current systems for FEVER are vulnerable to three categories of realistic challenges for fact-checking – multiple propositions, temporal reasoning, and ambiguity and lexical variation – and introduce a resource with these types of claims. Then we present a system designed to be resilient to these “attacks” using multiple pointer networks for document selection and jointly modeling a sequence of evidence sentences and veracity relation predictions. We find that in handling these attacks we obtain state-of-the-art results on FEVER, largely due to improved evidence retrieval.
2016
Online Multilingual Topic Models with Multi-Level Hyperpriors
Kriste Krstovski | David Smith | Michael J. Kurtz
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Bootstrapping Translation Detection and Sentence Extraction from Comparable Corpora
Kriste Krstovski | David Smith
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
2013
Online Polylingual Topic Models for Fast Document Translation Detection
Kriste Krstovski | David A. Smith
Proceedings of the Eighth Workshop on Statistical Machine Translation
2011
A Minimally Supervised Approach for Detecting and Ranking Document Translation Pairs
Kriste Krstovski | David A. Smith
Proceedings of the Sixth Workshop on Statistical Machine Translation