Content Moderation in Online Platforms: A Study of Annotation Methods for Inappropriate Language

Baran Barbarestani; Isa Maks; Piek T.J.M. Vossen

Abstract

Detecting inappropriate language in online platforms is vital for maintaining a safe and respectful digital environment, especially in the context of hate speech prevention. However, defining what constitutes inappropriate language can be highly subjective and context-dependent, varying from person to person. This study presents the outcomes of a comprehensive examination of the subjectivity involved in assessing inappropriateness within conversational contexts. Different annotation methods, including expert annotation, crowd annotation, ChatGPT-generated annotation, and lexicon-based annotation, were applied to English Reddit conversations. The analysis revealed a high level of agreement across these annotation methods, with most disagreements arising from subjective interpretations of inappropriate language. This emphasizes the importance of implementing content moderation systems that not only recognize inappropriate content but also understand and adapt to diverse user perspectives and contexts. The study contributes to the evolving field of hate speech annotation by providing a detailed analysis of annotation differences in relation to the subjective task of judging inappropriate words in conversations.

Anthology ID: 2024.trac-1.11
Volume: Proceedings of the Fourth Workshop on Threat, Aggression & Cyberbullying @ LREC-COLING-2024
Month: May
Year: 2024
Address: Torino, Italia
Editors: Ritesh Kumar, Atul Kr. Ojha, Shervin Malmasi, Bharathi Raja Chakravarthi, Bornini Lahiri, Siddharth Singh, Shyam Ratan
Venues: TRAC | WS
Publisher: ELRA and ICCL
Pages: 96–104
URL: https://aclanthology.org/2024.trac-1.11/
Cite (ACL): Baran Barbarestani, Isa Maks, and Piek T.J.M. Vossen. 2024. Content Moderation in Online Platforms: A Study of Annotation Methods for Inappropriate Language. In Proceedings of the Fourth Workshop on Threat, Aggression & Cyberbullying @ LREC-COLING-2024, pages 96–104, Torino, Italia. ELRA and ICCL.
Cite (Informal): Content Moderation in Online Platforms: A Study of Annotation Methods for Inappropriate Language (Barbarestani et al., TRAC 2024)
PDF: https://aclanthology.org/2024.trac-1.11.pdf
