Identifying Automatically Generated Headlines using Transformers

Antonis Maronikolakis, Hinrich Schütze, Mark Stevenson


Abstract
False information spread via the internet and social media influences public opinion and user activity, while generative models enable fake content to be generated faster and more cheaply than was previously possible. In the not-so-distant future, identifying fake content generated by deep learning models will play a key role in protecting users from misinformation. To this end, a dataset containing human- and computer-generated headlines was created, and a user study indicated that humans were able to identify the fake headlines in only 47.8% of cases. However, the most accurate automatic approach, transformers, achieved an overall accuracy of 85.7%, indicating that content generated by language models can be filtered out accurately.
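The abstract describes transformer-based binary classification of headlines as human-written versus machine-generated. As a rough illustration of that setup (not the authors' code; the architecture sizes, vocabulary, and data below are toy placeholders), a minimal PyTorch sketch of a transformer encoder classifier might look like:

```python
# Hypothetical sketch of transformer-based headline classification:
# a tiny encoder that labels a headline as human (0) or generated (1).
# All sizes and inputs are illustrative, not the paper's actual setup.
import torch
import torch.nn as nn

class HeadlineClassifier(nn.Module):
    def __init__(self, vocab_size=1000, d_model=32, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, nhead, dim_feedforward=64, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, 2)  # two classes: human / generated

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer token IDs
        h = self.encoder(self.embed(token_ids))   # (batch, seq_len, d_model)
        return self.head(h.mean(dim=1))           # mean-pool, then classify

model = HeadlineClassifier()
batch = torch.randint(0, 1000, (2, 12))  # 2 toy headlines, 12 tokens each
logits = model(batch)                    # (2, 2) class scores
preds = logits.argmax(dim=-1)            # predicted label per headline
```

In practice (and as the abstract implies), a pretrained transformer fine-tuned on the labeled headline dataset would be used rather than an encoder trained from scratch; the sketch only shows the classification structure.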
Anthology ID:
2021.nlp4if-1.1
Volume:
Proceedings of the Fourth Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda
Month:
June
Year:
2021
Address:
Online
Editors:
Anna Feldman, Giovanni Da San Martino, Chris Leberknight, Preslav Nakov
Venue:
NLP4IF
Publisher:
Association for Computational Linguistics
Pages:
1–6
URL:
https://aclanthology.org/2021.nlp4if-1.1
DOI:
10.18653/v1/2021.nlp4if-1.1
Cite (ACL):
Antonis Maronikolakis, Hinrich Schütze, and Mark Stevenson. 2021. Identifying Automatically Generated Headlines using Transformers. In Proceedings of the Fourth Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda, pages 1–6, Online. Association for Computational Linguistics.
Cite (Informal):
Identifying Automatically Generated Headlines using Transformers (Maronikolakis et al., NLP4IF 2021)
PDF:
https://aclanthology.org/2021.nlp4if-1.1.pdf