Bigfoot in Big Tech: Detecting Out of Domain Conspiracy Theories

Matthew Fort, Zuoyu Tian, Elizabeth Gabel, Nina Georgiades, Noah Sauer, Daniel Dakota, Sandra Kübler


Abstract
We investigate approaches to classifying texts into either conspiracy theory or mainstream using the Language Of Conspiracy (LOCO) corpus. Since conspiracy theories are not monolithic constructs, we need to identify approaches that robustly work in an out-of-domain setting (i.e., across conspiracy topics). We investigate whether optimal in-domain settings can be transferred to out-of-domain settings, and we investigate different methods for bleaching to steer classifiers away from words typical for an individual conspiracy theory. We find that BART works better than an SVM, that we can successfully classify out-of-domain, but there are no clear trends in how to choose the best source training domains. Additionally, bleaching only topic words works better than bleaching all content words or completely delexicalizing texts.
Anthology ID:
2023.ranlp-1.40
Volume:
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing
Month:
September
Year:
2023
Address:
Varna, Bulgaria
Editors:
Ruslan Mitkov, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
353–363
URL:
https://aclanthology.org/2023.ranlp-1.40
Cite (ACL):
Matthew Fort, Zuoyu Tian, Elizabeth Gabel, Nina Georgiades, Noah Sauer, Daniel Dakota, and Sandra Kübler. 2023. Bigfoot in Big Tech: Detecting Out of Domain Conspiracy Theories. In Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, pages 353–363, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Bigfoot in Big Tech: Detecting Out of Domain Conspiracy Theories (Fort et al., RANLP 2023)
PDF:
https://aclanthology.org/2023.ranlp-1.40.pdf