Is a bunch of words enough to detect disagreement in hateful content?

Giulia Rizzi, Paolo Rosso, Elisabetta Fersini


Abstract
The complexity of the annotation process when adopting crowdsourcing platforms for labeling hateful content can be linked to the presence of textual constituents that are ambiguous, easily misinterpreted, or characterized by reduced surrounding context. In this paper, we address the problem of perspectivism in hateful speech by leveraging contextualized embedding representations of sentence constituents together with weighted probability functions. The effectiveness of the proposed approach is assessed on four datasets provided for the SemEval 2023 Task 11 shared task. The results emphasize that a few elements can serve as a proxy to identify sentences that may be perceived differently by multiple readers, without necessarily resorting to complex Large Language Models.
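The core intuition of the abstract, that a sentence-level disagreement score can be built from a weighted combination of per-constituent signals, could be sketched as follows. This is an illustrative toy, not the authors' implementation: the per-token probabilities and weights below are hypothetical placeholders, whereas in the paper they derive from contextualized embedding representations of the constituents.

```python
# Illustrative sketch (not the paper's actual method): aggregating
# hypothetical per-token disagreement probabilities with a weighted
# probability function to obtain a sentence-level score.

def sentence_disagreement(token_scores, token_weights):
    """Weighted average of per-token disagreement probabilities."""
    total = sum(token_weights)
    if total == 0:
        return 0.0
    return sum(s * w for s, w in zip(token_scores, token_weights)) / total

# Toy example: three tokens, the second one flagged as ambiguous
# and therefore given a higher (hypothetical) salience weight.
scores = [0.1, 0.9, 0.2]   # hypothetical per-token probabilities
weights = [1.0, 2.0, 1.0]  # hypothetical salience weights
print(round(sentence_disagreement(scores, weights), 3))
```

A single strongly ambiguous constituent can thus dominate the sentence-level score, which mirrors the paper's finding that a few elements suffice as a proxy for disagreement.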
Anthology ID:
2025.comedi-1.1
Volume:
Proceedings of Context and Meaning: Navigating Disagreements in NLP Annotation
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Michael Roth, Dominik Schlechtweg
Venues:
CoMeDi | WS
Publisher:
International Committee on Computational Linguistics
Pages:
1–11
URL:
https://aclanthology.org/2025.comedi-1.1/
Cite (ACL):
Giulia Rizzi, Paolo Rosso, and Elisabetta Fersini. 2025. Is a bunch of words enough to detect disagreement in hateful content?. In Proceedings of Context and Meaning: Navigating Disagreements in NLP Annotation, pages 1–11, Abu Dhabi, UAE. International Committee on Computational Linguistics.
Cite (Informal):
Is a bunch of words enough to detect disagreement in hateful content? (Rizzi et al., CoMeDi 2025)
PDF:
https://aclanthology.org/2025.comedi-1.1.pdf