GroupToM-Bench: Benchmarking Group Theory of Mind and Nonlinear Social Emergence in MLLMs

Weidong Tang; Jierui Li; Yueling Hou; Zihan Mei; Can Zhang; Xinyan Wan; Zhiyuan Liang; Pengfei Zhou; Yang You; Wangbo Zhao

Abstract

True general intelligence requires not only a model of the physical world but also a social world model: the capacity to infer how individual mental states interact and crystallize into group-level outcomes. Despite notable progress in individual-level Theory of Mind (ToM) reasoning, existing multimodal large language models systematically fail at this: collective behavior emerges non-linearly from social tensions, conformity dynamics, and structural constraints, and cannot be recovered by summing individual intentions. We present ***GroupToM-Bench***, the first multimodal benchmark for group-level ToM, built around a causal chain spanning micro-level BDI states (belief, desire, intention), meso-level group tension and structural constraints, and macro-level outcome prediction and mechanistic attribution. To probe this full arc, we develop a seven-level cognitive audit framework. Experiments reveal that frontier models perform significantly below human levels, exposing fundamental blind spots in modeling social structures and nonlinear collective behavior.

Anthology ID: 2026.acl-long.1859
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 40007–40031
URL: https://aclanthology.org/2026.acl-long.1859/
Cite (ACL): Weidong Tang, Jierui Li, Yueling Hou, Zihan Mei, Can Zhang, Xinyan Wan, Zhiyuan Liang, Pengfei Zhou, Yang You, and Wangbo Zhao. 2026. GroupToM-Bench: Benchmarking Group Theory of Mind and Nonlinear Social Emergence in MLLMs. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 40007–40031, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): GroupToM-Bench: Benchmarking Group Theory of Mind and Nonlinear Social Emergence in MLLMs (Tang et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.1859.pdf
Checklist: 2026.acl-long.1859.checklist.pdf
