Team “NoConflict” at CASE 2021 Task 1: Pretraining for Sentence-Level Protest Event Detection

Tiancheng Hu, Niklas Stoehr


Abstract
An ever-increasing amount of text, in the form of social media posts and news articles, gives rise to new challenges and opportunities for the automatic extraction of socio-political events. In this paper, we present our submission to the Shared Tasks on Socio-Political and Crisis Events Detection, Task 1, Multilingual Protest News Detection, Subtask 2, Event Sentence Classification, of CASE @ ACL-IJCNLP 2021. In our submission, we utilize the RoBERTa model with additional pretraining, and achieve the best F1 score of 0.8532 in event sentence classification in English and the second-best F1 score of 0.8700 in Portuguese via simple translation. We analyze the failure cases of our model and conduct an ablation study to show the effect of choosing the right pretrained language model, adding training data, and applying data augmentation.
Anthology ID:
2021.case-1.20
Volume:
Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | CASE | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
152–160
URL:
https://aclanthology.org/2021.case-1.20
DOI:
10.18653/v1/2021.case-1.20
PDF:
https://aclanthology.org/2021.case-1.20.pdf