Danger Depends on the Mind: A Theory-of-Mind Grounded Dataset and Model for Context-Dependent Dangerous Speech

Yuanchen Shi; Longyin Zhang; Guodong Zhou (周国栋); Fang Kong (孔芳)

Abstract

Dangerous speech detection is a well-studied task, but existing approaches typically treat utterances in isolation, relying on binary labels that ignore who is speaking and in what mental state. We formulate a context-dependent variant of this task by grounding it in Theory-of-Mind (ToM). In cognitive science, ToM studies how humans attribute latent mental states, such as emotions, intentions, and actions, to others. We argue that such states are key signals for assessing the risk of an utterance. Building on this view, we construct ToM-DS, a 79K-instance dataset in which each utterance is paired with structured speaker profiles, ToM states (emotion, intent, action), and topic hierarchies. During data construction, we first identify context-dependent sentences and generate diverse safe and dangerous scenarios surrounding them. High-quality annotations are obtained with state-of-the-art LLMs and a multi-stage cross-agent validation pipeline, yielding a comprehensive and reliable resource for context-dependent dangerous speech detection and fine-grained risk level classification. We further propose ToMGuard, a lightweight model with a dynamic ToM attention mechanism that adaptively weighs different mental-state cues. ToMGuard outperforms strong proprietary and open-source LLMs with significantly fewer parameters. Experimental results show that ToMGuard sets a new benchmark for context-dependent dangerous speech detection and risk level classification on ToM-DS.

Anthology ID: 2026.findings-acl.322
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 6457–6478
URL: https://aclanthology.org/2026.findings-acl.322/
Cite (ACL): Yuanchen Shi, Longyin Zhang, Guodong Zhou, and Fang Kong. 2026. Danger Depends on the Mind: A Theory-of-Mind Grounded Dataset and Model for Context-Dependent Dangerous Speech. In Findings of the Association for Computational Linguistics: ACL 2026, pages 6457–6478, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Danger Depends on the Mind: A Theory-of-Mind Grounded Dataset and Model for Context-Dependent Dangerous Speech (Shi et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.322.pdf
Checklist: 2026.findings-acl.322.checklist.pdf
