EmpathicStories++: A Multimodal Dataset for Empathy Towards Personal Experiences

Jocelyn Shen, Yubin Kim, Mohit Hulse, Wazeer Zulfikar, Sharifa Alghowinem, Cynthia Breazeal, Hae Park


Abstract
Modeling empathy is a complex endeavor that is rooted in interpersonal and experiential dimensions of human interaction, and remains an open problem within AI. Existing empathy datasets fall short in capturing the richness of empathy responses, often being confined to in-lab or acted scenarios, lacking longitudinal data, and missing self-reported labels. We introduce a new multimodal dataset for empathy during personal experience sharing: the EmpathicStories++ dataset containing 53 hours of video, audio, and text data of 41 participants sharing vulnerable experiences and reading empathically resonant stories with an AI agent. EmpathicStories++ is the first longitudinal dataset on empathy, collected over a month-long deployment of social robots in participants’ homes, as participants engage in natural, empathic storytelling interactions with AI agents. We then introduce a novel task of predicting individuals’ empathy toward others’ stories based on their personal experiences, evaluated in two contexts: participants’ own personal shared story context and their reflections on stories they read. We benchmark this task using state-of-the-art models to pave the way for future improvements in contextualized and longitudinal empathy modeling. Our work provides a valuable resource for further research in developing empathetic AI systems and understanding the intricacies of human empathy within genuine, real-world settings.
Anthology ID:
2024.findings-acl.268
Volume:
Findings of the Association for Computational Linguistics ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand and virtual meeting
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
4525–4536
URL:
https://aclanthology.org/2024.findings-acl.268
Cite (ACL):
Jocelyn Shen, Yubin Kim, Mohit Hulse, Wazeer Zulfikar, Sharifa Alghowinem, Cynthia Breazeal, and Hae Park. 2024. EmpathicStories++: A Multimodal Dataset for Empathy Towards Personal Experiences. In Findings of the Association for Computational Linguistics ACL 2024, pages 4525–4536, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
EmpathicStories++: A Multimodal Dataset for Empathy Towards Personal Experiences (Shen et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.268.pdf