Developing a New Classifier for Automated Identification of Incivility in Social Media

Sam Davidson, Qiusi Sun, Magdalena Wojcieszak


Abstract
Incivility is not only prevalent on online social media platforms, but also has concrete effects on individual users, online groups, and the platforms themselves. Given the prevalence and effects of online incivility, and the challenges involved in human-based incivility detection, it is urgent to develop validated and versatile automatic approaches to identifying uncivil posts and comments. This project advances both a neural, BERT-based classifier and a logistic regression classifier to identify uncivil comments. The classifier is trained on a dataset of Reddit posts annotated for incivility, which is further expanded using a combination of labeled data from Reddit and Twitter. Our best-performing model achieves an F1 of 0.802 on our Reddit test set. The final model is not only applicable across social media platforms and their distinct data structures, but also computationally versatile and, as such, ready to be used on vast volumes of online data. All trained models and annotated data are made available to the research community.
Anthology ID:
2020.alw-1.12
Volume:
Proceedings of the Fourth Workshop on Online Abuse and Harms
Month:
November
Year:
2020
Address:
Online
Venues:
ALW | EMNLP
Publisher:
Association for Computational Linguistics
Pages:
95–101
URL:
https://aclanthology.org/2020.alw-1.12
DOI:
10.18653/v1/2020.alw-1.12
Cite (ACL):
Sam Davidson, Qiusi Sun, and Magdalena Wojcieszak. 2020. Developing a New Classifier for Automated Identification of Incivility in Social Media. In Proceedings of the Fourth Workshop on Online Abuse and Harms, pages 95–101, Online. Association for Computational Linguistics.
Cite (Informal):
Developing a New Classifier for Automated Identification of Incivility in Social Media (Davidson et al., ALW 2020)
PDF:
https://aclanthology.org/2020.alw-1.12.pdf
Video:
https://slideslive.com/38939531
Data:
Hate Speech and Offensive Language