UWBA at SemEval-2024 Task 3: Dialogue Representation and Multimodal Fusion for Emotion Cause Analysis

Josef Baloun, Jiri Martinek, Ladislav Lenc, Pavel Kral, Matěj Zeman, Lukáš Vlček


Abstract
In this paper, we present an approach for solving SemEval-2024 Task 3: The Competition of Multimodal Emotion Cause Analysis in Conversations. The task includes two subtasks that focus on emotion-cause pair extraction using text, video, and audio modalities. Our approach encodes each modality (MFCC and Wav2Vec for audio, 3D-CNN for video, and transformer-based models for text) and combines the encodings in an utterance-level fusion module. The model is then optimized for link and emotion prediction simultaneously. Our approach achieved 6th place in both subtasks. The full leaderboard can be found at https://codalab.lisn.upsaclay.fr/competitions/16141#results
Anthology ID:
2024.semeval-1.49
Volume:
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
316–325
URL:
https://aclanthology.org/2024.semeval-1.49
DOI:
10.18653/v1/2024.semeval-1.49
Cite (ACL):
Josef Baloun, Jiri Martinek, Ladislav Lenc, Pavel Kral, Matěj Zeman, and Lukáš Vlček. 2024. UWBA at SemEval-2024 Task 3: Dialogue Representation and Multimodal Fusion for Emotion Cause Analysis. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 316–325, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
UWBA at SemEval-2024 Task 3: Dialogue Representation and Multimodal Fusion for Emotion Cause Analysis (Baloun et al., SemEval 2024)
PDF:
https://aclanthology.org/2024.semeval-1.49.pdf
Supplementary material:
2024.semeval-1.49.SupplementaryMaterial.txt