DeepPavlov at SemEval-2024 Task 3: Multimodal Large Language Models in Emotion Reasoning

Julia Belikova; Dmitrii Kosenko

doi:10.18653/v1/2024.semeval-1.249

Abstract

This paper presents the solution of the DeepPavlov team for the Multimodal Sentiment Cause Analysis competition in SemEval-2024 Task 3, Subtask 2 (Wang et al., 2024). In the evaluation leaderboard, our approach ranks 7th with an F1-score of 0.2132. Large Language Models (LLMs) are transformative in their ability to comprehend and generate human-like text. With recent advancements, Multimodal Large Language Models (MLLMs) have expanded LLM capabilities, integrating different modalities such as audio, vision, and language. Our work delves into the state-of-the-art MLLM Video-LLaMA, its associated modalities, and its application to the emotion reasoning downstream task, Multimodal Emotion Cause Analysis in Conversations (MECAC). We investigate the model’s performance in several modes: zero-shot, few-shot, individual embeddings, and fine-tuned, providing insights into their limits and potential enhancements for emotion understanding.

Anthology ID: 2024.semeval-1.249
Volume: Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Venue: SemEval
SIG: SIGLEX
Publisher: Association for Computational Linguistics
Pages: 1747–1757
URL: https://aclanthology.org/2024.semeval-1.249/
DOI: 10.18653/v1/2024.semeval-1.249
Cite (ACL): Julia Belikova and Dmitrii Kosenko. 2024. DeepPavlov at SemEval-2024 Task 3: Multimodal Large Language Models in Emotion Reasoning. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1747–1757, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): DeepPavlov at SemEval-2024 Task 3: Multimodal Large Language Models in Emotion Reasoning (Belikova & Kosenko, SemEval 2024)
PDF: https://aclanthology.org/2024.semeval-1.249.pdf
Supplementary material: 2024.semeval-1.249.SupplementaryMaterial.zip
Supplementary material: 2024.semeval-1.249.SupplementaryMaterial.txt
