A Generalized Approach to Protest Event Detection in German Local News

Gregor Wiedemann, Jan Matti Dollbaum, Sebastian Haunss, Priska Daphi, Larissa Daria Meier


Abstract
Protest events provide information about social and political conflicts, the state of social cohesion and democratic conflict management, as well as the state of civil society in general. Social scientists are therefore interested in the systematic observation of protest events. With this paper, we release the first German-language resource of protest-event-related article excerpts published in local news outlets. We use this dataset to train and evaluate transformer-based text classifiers that automatically detect relevant newspaper articles. Our best approach reaches a binary F1-score of 93.3 %, a promising result for our goal of supporting political science research. However, in a second experiment, we show that our model does not generalize equally well when applied to data from time periods and localities other than those in our training sample. To make protest event detection more robust, we test two alternative preprocessing approaches. First, we find that letting the classifier concentrate on sentences around protest keywords improves the F1-score for out-of-sample data by up to 4 percentage points. Second, against our initial intuition, masking named entities during preprocessing does not improve generalization in terms of F1-scores. However, it does lead to significantly improved recall of the models.
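The keyword-centered preprocessing mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword stems in `PROTEST_KEYWORDS`, the naive regex sentence splitter, the `window` parameter, and the function names are all assumptions made for illustration. The idea is only to keep sentences that contain a protest keyword, plus a few neighbouring sentences, before passing the text to a classifier.

```python
import re

# Hypothetical keyword stems; the keyword list used in the paper is not given here.
PROTEST_KEYWORDS = ("protest", "demonstr", "kundgebung", "streik")


def split_sentences(text):
    """Naive split on sentence-final punctuation (assumption: a real
    pipeline would use a proper German sentence tokenizer)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def keyword_window(text, keywords=PROTEST_KEYWORDS, window=1):
    """Keep each sentence containing a keyword plus `window` sentences
    on either side; return the text unchanged if no keyword matches."""
    sentences = split_sentences(text)
    keep = set()
    for i, sentence in enumerate(sentences):
        lowered = sentence.lower()
        if any(keyword in lowered for keyword in keywords):
            start = max(0, i - window)
            end = min(len(sentences), i + window + 1)
            keep.update(range(start, end))
    if not keep:
        return text
    return " ".join(sentences[i] for i in sorted(keep))


article = (
    "Das Wetter war gut. "
    "Am Samstag gab es eine Kundgebung vor dem Rathaus. "
    "Rund 200 Menschen nahmen teil. "
    "Der Markt war gut besucht. "
    "Abends spielte eine Band."
)
print(keyword_window(article, window=1))
```

With `window=1`, the sketch keeps the sentence mentioning the "Kundgebung" and its two neighbours, and drops the unrelated sentences at the end of the article. Falling back to the full text when no keyword is found is a design choice of this sketch, not something the abstract specifies.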
Anthology ID:
2022.lrec-1.413
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
3883–3891
URL:
https://aclanthology.org/2022.lrec-1.413
Cite (ACL):
Gregor Wiedemann, Jan Matti Dollbaum, Sebastian Haunss, Priska Daphi, and Larissa Daria Meier. 2022. A Generalized Approach to Protest Event Detection in German Local News. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3883–3891, Marseille, France. European Language Resources Association.
Cite (Informal):
A Generalized Approach to Protest Event Detection in German Local News (Wiedemann et al., LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.413.pdf