Abusive Language on Social Media Through the Legal Looking Glass

Thales Bertaglia, Andreea Grigoriu, Michel Dumontier, Gijs van Dijck


Abstract
Abusive language is a growing phenomenon on social media platforms. Its effects can reach beyond the online context, contributing to mental or emotional stress for users. Automatic tools for detecting abuse can alleviate the issue. In practice, developing automated methods to detect abusive language relies on good-quality data. However, there is currently a lack of standards for creating datasets in the field. These standards include definitions of what is considered abusive language, annotation guidelines, and reporting on the process. This paper introduces an annotation framework inspired by legal concepts to define abusive language in the context of online harassment. The framework uses a 7-point Likert scale for labelling instead of class labels. We also present ALYT – a dataset of Abusive Language on YouTube. ALYT includes YouTube comments in English extracted from videos on different controversial topics and labelled by Law students. The comments were sampled from the actual collected data, without artificial methods for increasing the abusive content. The paper describes the annotation process thoroughly, including all its guidelines and training steps.
Anthology ID:
2021.woah-1.20
Volume:
Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)
Month:
August
Year:
2021
Address:
Online
Editors:
Aida Mostafazadeh Davani, Douwe Kiela, Mathias Lambert, Bertie Vidgen, Vinodkumar Prabhakaran, Zeerak Waseem
Venue:
WOAH
Publisher:
Association for Computational Linguistics
Pages:
191–200
URL:
https://aclanthology.org/2021.woah-1.20
DOI:
10.18653/v1/2021.woah-1.20
Cite (ACL):
Thales Bertaglia, Andreea Grigoriu, Michel Dumontier, and Gijs van Dijck. 2021. Abusive Language on Social Media Through the Legal Looking Glass. In Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021), pages 191–200, Online. Association for Computational Linguistics.
Cite (Informal):
Abusive Language on Social Media Through the Legal Looking Glass (Bertaglia et al., WOAH 2021)
PDF:
https://aclanthology.org/2021.woah-1.20.pdf
Video:
https://aclanthology.org/2021.woah-1.20.mp4
Code:
thalesbertaglia/alyt
Data:
OLID