Toximatics: Towards Understanding Toxicity in Real-Life Social Situations

Mayukh Das, Wolf-Tilo Balke


Abstract
The proliferation of social media has increased the visibility and effects of hate speech. To address this, NLP solutions have been developed to identify both explicit and implicit forms of hate speech. Typically, these approaches evaluate the toxicity of utterances in isolation, ignoring the context. Drawing on pragmatics, our study examines how contextual factors can influence the perceived toxicity of utterances, thereby anchoring assessments in a more nuanced semantic framework. We present Toximatics, a dataset that includes context-dependent utterances and their toxicity scores. We also introduce a novel synthetic data generation pipeline designed to create context-utterance pairs at scale with controlled polarity. This pipeline can enhance existing hate speech datasets by adding contextual information to utterances, either preserving or altering their polarity, and can also generate completely new pairs from seed statements. We utilised both features to create Toximatics. To address biases in state-of-the-art hate speech datasets, which often skew towards specific sensitive topics such as politics, race, and gender, we propose a method to generate neutral utterances typical of various social settings. These are then contextualized to show how neutrality can shift to toxicity or benignity depending on the surrounding context. The evaluation results indicate that current models underperform on this dataset.
Anthology ID:
2024.sigdial-1.65
Volume:
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Month:
September
Year:
2024
Address:
Kyoto, Japan
Editors:
Tatsuya Kawahara, Vera Demberg, Stefan Ultes, Koji Inoue, Shikib Mehri, David Howcroft, Kazunori Komatani
Venue:
SIGDIAL
SIG:
SIGDIAL
Publisher:
Association for Computational Linguistics
Pages:
770–785
URL:
https://aclanthology.org/2024.sigdial-1.65
DOI:
10.18653/v1/2024.sigdial-1.65
Cite (ACL):
Mayukh Das and Wolf-Tilo Balke. 2024. Toximatics: Towards Understanding Toxicity in Real-Life Social Situations. In Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 770–785, Kyoto, Japan. Association for Computational Linguistics.
Cite (Informal):
Toximatics: Towards Understanding Toxicity in Real-Life Social Situations (Das & Balke, SIGDIAL 2024)
PDF:
https://aclanthology.org/2024.sigdial-1.65.pdf