Understanding BERT performance in propaganda analysis

Yiqing Hua

Abstract
In this paper, we describe our system for the shared task on fine-grained propaganda analysis at the sentence level. Despite the challenging nature of the task, our pretrained BERT model (team YMJA), fine-tuned on the training dataset provided by the shared task, scored 0.62 F1 on the test set and ranked third among 25 teams that participated in the contest. We present a set of illustrative experiments to better understand the performance of our BERT model on this shared task. Further, we explore beyond the given dataset for false-positive cases that are likely to be produced by our system. We show that despite its high performance on the given test set, our system tends to classify opinion pieces as propaganda and cannot distinguish quotations of propaganda speech from actual usage of propaganda techniques.
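The abstract describes fine-tuning a pretrained BERT model for sentence-level propaganda classification. The sketch below illustrates that general approach using the HuggingFace transformers library; it is not the authors' code, and the model name, hyperparameters, and toy data are assumptions for illustration only.

```python
# Minimal sketch (not the authors' system): fine-tuning a pretrained BERT model
# for sentence-level binary classification (propaganda vs. non-propaganda).
# Model name, hyperparameters, and data are illustrative assumptions.
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Hypothetical training data: sentences with 0/1 propaganda labels.
sentences = ["An example sentence from a news article.", "Another example sentence."]
labels = [0, 1]

enc = tokenizer(sentences, padding=True, truncation=True, max_length=128, return_tensors="pt")
dataset = TensorDataset(enc["input_ids"], enc["attention_mask"], torch.tensor(labels))
loader = DataLoader(dataset, batch_size=16, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):  # a few epochs are typical for BERT fine-tuning
    for input_ids, attention_mask, y in loader:
        optimizer.zero_grad()
        out = model(input_ids=input_ids, attention_mask=attention_mask, labels=y)
        out.loss.backward()
        optimizer.step()

# Inference: predict whether a new sentence uses a propaganda technique.
model.eval()
with torch.no_grad():
    test = tokenizer(["A sentence to classify."], return_tensors="pt")
    prediction = model(**test).logits.argmax(dim=-1)
```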
Anthology ID:
D19-5019
Volume:
Proceedings of the Second Workshop on Natural Language Processing for Internet Freedom: Censorship, Disinformation, and Propaganda
Month:
November
Year:
2019
Address:
Hong Kong, China
Editors:
Anna Feldman, Giovanni Da San Martino, Alberto Barrón-Cedeño, Chris Brew, Chris Leberknight, Preslav Nakov
Venue:
NLP4IF
Publisher:
Association for Computational Linguistics
Pages:
135–138
URL:
https://aclanthology.org/D19-5019
DOI:
10.18653/v1/D19-5019
Cite (ACL):
Yiqing Hua. 2019. Understanding BERT performance in propaganda analysis. In Proceedings of the Second Workshop on Natural Language Processing for Internet Freedom: Censorship, Disinformation, and Propaganda, pages 135–138, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Understanding BERT performance in propaganda analysis (Hua, NLP4IF 2019)
PDF:
https://aclanthology.org/D19-5019.pdf