Dynamic Graph Neural ODE Network for Multi-modal Emotion Recognition in Conversation

Yuntao Shou, Tao Meng, Wei Ai, Keqin Li


Abstract
Multimodal emotion recognition in conversation (MERC) refers to identifying and classifying human emotional states by combining data from multiple modalities (e.g., audio, images, text, and video). Human emotional expressions are often complex and diverse, and fusing multimodal information allows them to be captured and understood more comprehensively. Most existing graph-based multimodal emotion recognition methods can only use shallow GCNs to extract emotion features, and they fail to capture the temporal dependencies caused by dynamic changes in emotions. To address these problems, we propose a Dynamic Graph Neural Ordinary Differential Equation Network (DGODE) for multimodal emotion recognition in conversation, which models the dynamic changes of emotions to capture the temporal dependencies of speakers’ emotions. Technically, the key idea of DGODE is to use a graph ODE evolution network to characterize the continuous dynamics of node representations over time and capture temporal dependencies. Extensive experiments on two publicly available multimodal emotion recognition datasets demonstrate that the proposed DGODE model outperforms various baselines. Furthermore, DGODE can also alleviate the over-smoothing problem, thereby enabling the construction of deep GCNs.
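The abstract does not give the ODE itself, so the following is only a rough sketch of what continuous graph evolution of node representations can look like, not the DGODE formulation from the paper. It assumes a generic diffusion-style ODE, dH/dt = (Â − I)H + H0, integrated with forward Euler steps; the function names, the restart term H0, and the toy three-utterance chain graph are all hypothetical choices made for illustration.

```python
import numpy as np


def normalized_adjacency(adj):
    """Add self-loops and apply symmetric normalization D^-1/2 (A + I) D^-1/2."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_ode_euler(h0, adj, t_end=1.0, steps=10):
    """Integrate dH/dt = (A_hat - I) H + H0 with forward Euler.

    Assumed ODE for illustration only: the H0 term re-injects the initial
    node features at every step, one common way to limit over-smoothing.
    """
    a_hat = normalized_adjacency(adj)
    h = h0.copy()
    dt = t_end / steps
    for _ in range(steps):
        dh = a_hat @ h - h + h0  # right-hand side of the assumed ODE
        h = h + dt * dh          # one Euler step of size dt
    return h


# Toy conversation graph: three utterances connected in a chain (hypothetical).
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
h0 = np.random.default_rng(0).normal(size=(3, 4))  # 3 nodes, 4 features each
h_final = graph_ode_euler(h0, adj, t_end=1.0, steps=20)
```

In this sketch the integration time `t_end` plays the role that layer count plays in a discrete GCN, so "depth" can be increased by integrating longer or with more steps instead of stacking more layers.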
Anthology ID:
2025.coling-main.18
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
256–268
URL:
https://aclanthology.org/2025.coling-main.18/
Cite (ACL):
Yuntao Shou, Tao Meng, Wei Ai, and Keqin Li. 2025. Dynamic Graph Neural ODE Network for Multi-modal Emotion Recognition in Conversation. In Proceedings of the 31st International Conference on Computational Linguistics, pages 256–268, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Dynamic Graph Neural ODE Network for Multi-modal Emotion Recognition in Conversation (Shou et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.18.pdf