Corpus Development for Studying Online Disinformation Campaign: A Narrative + Stance Approach

Mack Blackburn, Ning Yu, John Berrie, Brian Gordon, David Longfellow, William Tirrell, Mark Williams


Abstract
Disinformation on social media affects both our personal lives and society. The outbreak of the new coronavirus is the most recent example, in which a wealth of disinformation provoked fear, hate, and even social panic. While there is emerging interest in studying how disinformation campaigns form, spread, and influence target audiences, developing disinformation campaign corpora is challenging given the high volume, fast evolution, and wide variation of messages associated with each campaign. Disinformation cannot always be captured by simple fact-checking, which makes it even more challenging to validate and create ground truth. This paper presents our approach to developing a corpus for studying disinformation campaigns targeting the White Helmets of Syria. We bypass directly classifying a piece of information as disinformation or not. Instead, we label the narrative and stance of tweets and YouTube comments about the White Helmets. A narrative is defined as a recurring statement that is used to express a point of view. Stance is a high-level point of view on a topic. We demonstrate that narrative and stance together can provide a dynamic method for real-world users, e.g., intelligence analysts, to quickly identify and counter disinformation campaigns based on their knowledge at the time.
Anthology ID:
2020.stoc-1.7
Volume:
Proceedings for the First International Workshop on Social Threats in Online Conversations: Understanding and Management
Month:
May
Year:
2020
Address:
Marseille, France
Venues:
LREC | STOC | WS
Publisher:
European Language Resources Association
Pages:
41–47
Language:
English
URL:
https://aclanthology.org/2020.stoc-1.7
PDF:
https://aclanthology.org/2020.stoc-1.7.pdf