Social Genome: Grounded Social Reasoning Abilities of Multimodal Models

Leena Mathur; Marian Qian; Paul Pu Liang; Louis-Philippe Morency

doi:10.18653/v1/2025.emnlp-main.1264

Abstract

Social reasoning abilities are crucial for AI systems to effectively interpret and respond to multimodal human communication and interaction within social contexts. We introduce Social Genome, the first benchmark for fine-grained, grounded social reasoning abilities of multimodal models. Social Genome contains 272 videos of interactions and 1,486 human-annotated reasoning traces related to inferences about these interactions. These traces contain 5,777 reasoning steps that reference evidence from visual cues, verbal cues, vocal cues, and external knowledge (contextual knowledge external to videos). Social Genome is also the first modeling challenge to study external knowledge in social reasoning. Social Genome computes metrics to holistically evaluate semantic and structural qualities of model-generated social reasoning traces. We demonstrate the utility of Social Genome through experiments with state-of-the-art models, identifying performance gaps and opportunities for future research to improve the grounded social reasoning abilities of multimodal models.

Anthology ID: 2025.emnlp-main.1264
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 24868–24891
URL: https://aclanthology.org/2025.emnlp-main.1264/
DOI: 10.18653/v1/2025.emnlp-main.1264
Cite (ACL): Leena Mathur, Marian Qian, Paul Pu Liang, and Louis-Philippe Morency. 2025. Social Genome: Grounded Social Reasoning Abilities of Multimodal Models. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 24868–24891, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Social Genome: Grounded Social Reasoning Abilities of Multimodal Models (Mathur et al., EMNLP 2025)
PDF: https://aclanthology.org/2025.emnlp-main.1264.pdf
Checklist: 2025.emnlp-main.1264.checklist.pdf
