Multimodal Emotion Recognition in Conversations: A Survey of Methods, Trends, Challenges and Prospects

ChengYan Wu; Yiqiang Cai; Yang Liu; Pengxu Zhu; Yun Xue (薛云); Ziwei Gong; Julia Hirschberg; Bolei Ma

doi:10.18653/v1/2025.findings-emnlp.332

Abstract

While text-based emotion recognition methods have achieved notable success, real-world dialogue systems often demand a more nuanced emotional understanding than any single modality can offer. Multimodal Emotion Recognition in Conversations (MERC) has thus emerged as a crucial direction for enhancing the naturalness and emotional understanding of human-computer interaction. Its goal is to accurately recognize emotions by integrating information from various modalities such as text, speech, and visual signals. This survey offers a systematic overview of MERC, including its motivations, core tasks, representative methods, and evaluation strategies. We further examine recent trends, highlight key challenges, and outline future directions. As interest in emotionally intelligent systems grows, this survey provides timely guidance for advancing MERC research.

Anthology ID: 2025.findings-emnlp.332
Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 6257–6274
URL: https://aclanthology.org/2025.findings-emnlp.332/
DOI: 10.18653/v1/2025.findings-emnlp.332
Cite (ACL): ChengYan Wu, Yiqiang Cai, Yang Liu, Pengxu Zhu, Yun Xue, Ziwei Gong, Julia Hirschberg, and Bolei Ma. 2025. Multimodal Emotion Recognition in Conversations: A Survey of Methods, Trends, Challenges and Prospects. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 6257–6274, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Multimodal Emotion Recognition in Conversations: A Survey of Methods, Trends, Challenges and Prospects (Wu et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-emnlp.332.pdf
Checklist: 2025.findings-emnlp.332.checklist.pdf
