COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements

Xuhui Zhou, Hao Zhu, Akhila Yerukola, Thomas Davidson, Jena D. Hwang, Swabha Swayamdipta, Maarten Sap


Abstract
Warning: This paper contains content that may be offensive or upsetting.

Understanding the harms and offensiveness of statements requires reasoning about the social and situational context in which statements are made. For example, the utterance “your English is very good” may implicitly signal an insult when uttered by a white man to a non-white colleague, but would be interpreted as a genuine compliment when uttered by an ESL teacher to their student. Such contextual factors have been largely ignored by previous approaches to toxic language detection. We introduce COBRA frames, the first context-aware formalism for explaining the intents, reactions, and harms of offensive or biased statements grounded in their social and situational context. We create COBRACORPUS, a dataset of 33k potentially offensive statements paired with machine-generated contexts and free-text explanations of offensiveness, implied biases, speaker intents, and listener reactions. To study the contextual dynamics of offensiveness, we train models to generate COBRA explanations, with and without access to the context. We find that explanations from context-agnostic models are significantly worse than those from context-aware ones, especially in situations where the context inverts the statement’s offensiveness (29% accuracy drop). Our work highlights the importance and feasibility of contextualized NLP by modeling social factors.
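To make the formalism concrete, the minimal sketch below encodes the abstract’s running example as a single COBRA-style record. The field names and values are illustrative assumptions based only on the dimensions the abstract names (context, speaker intent, implied bias, listener reaction, offensiveness); they are not the paper’s exact schema or an actual COBRACORPUS entry.

```python
from dataclasses import dataclass

@dataclass
class CobraFrame:
    """Illustrative COBRA-style record: a statement grounded in context,
    with free-text explanations along the dimensions from the abstract."""
    statement: str          # the potentially offensive utterance
    context: str            # social/situational context of the utterance
    speaker_intent: str     # free-text explanation of the speaker's intent
    implied_bias: str       # bias or stereotype the statement implies
    listener_reaction: str  # likely reaction of the listener or target
    offensiveness: str      # explanation of whether/why it is offensive

# Paraphrase of the abstract's example; context inverts the reading.
insult = CobraFrame(
    statement="Your English is very good.",
    context="said by a white man to a non-white colleague",
    speaker_intent="to compliment, perhaps condescendingly",
    implied_bias="non-white colleagues are assumed not to speak English well",
    listener_reaction="may feel othered or insulted",
    offensiveness="a microaggression in this context",
)
compliment = CobraFrame(
    statement="Your English is very good.",
    context="said by an ESL teacher to their student",
    speaker_intent="to encourage the student's progress",
    implied_bias="none implied in this context",
    listener_reaction="likely feels encouraged",
    offensiveness="not offensive in this context",
)
```

The same statement appears in both records; only the context fields differ, which is exactly the contrast the context-aware versus context-agnostic experiments probe.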
Anthology ID:
2023.findings-acl.392
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6294–6315
URL:
https://aclanthology.org/2023.findings-acl.392
DOI:
10.18653/v1/2023.findings-acl.392
Cite (ACL):
Xuhui Zhou, Hao Zhu, Akhila Yerukola, Thomas Davidson, Jena D. Hwang, Swabha Swayamdipta, and Maarten Sap. 2023. COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements. In Findings of the Association for Computational Linguistics: ACL 2023, pages 6294–6315, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements (Zhou et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-acl.392.pdf
Video:
https://aclanthology.org/2023.findings-acl.392.mp4