Modal Feature Optimization Network with Prompt for Multimodal Sentiment Analysis

Xiangmin Zhang, Wei Wei, Shihao Zou


Abstract
Multimodal sentiment analysis (MSA) is mostly used to understand human emotional states through multiple modalities. However, because the effective information carried by the modalities is unbalanced, a modality that contains less effective information cannot fully play its complementary role. Therefore, the goal of this paper is to fully explore the effective information in each modality and further optimize the under-optimized modal representations. To this end, we propose a novel Modal Feature Optimization Network (MFON) with a Modal Prompt Attention (MPA) mechanism for MSA. Specifically, we first determine which modalities are under-optimized in MSA, and then use relevant prompt information to direct the model's attention to the features of these modalities. This improves the utilization of each modality's feature representation and facilitates initial information aggregation across modalities. Subsequently, we design an intra-modal knowledge distillation strategy for the under-optimized modalities, which preserves the integrity of the modal features. Furthermore, we implement inter-modal contrastive learning to better extract related features across modalities, thereby optimizing the entire network. Finally, sentiment prediction is carried out through the effective fusion of multimodal information. Extensive experimental results on public benchmark datasets demonstrate that our proposed method outperforms existing state-of-the-art models.
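The abstract does not say which loss the inter-modal contrastive learning uses. As a rough illustration only, the sketch below assumes an InfoNCE-style objective, a common choice for aligning two modalities; it is not the authors' implementation. The names `info_nce`, `text_feats`, `audio_feats`, and the `temperature` value are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(text_feats, audio_feats, temperature=0.1):
    """Mean InfoNCE loss in the text-to-audio direction.

    Assumption: row i of both lists comes from the same utterance, so
    (text_feats[i], audio_feats[i]) is the positive pair and every other
    audio row in the batch serves as a negative.
    """
    n = len(text_feats)
    total = 0.0
    for i in range(n):
        logits = [
            cosine(text_feats[i], audio_feats[j]) / temperature
            for j in range(n)
        ]
        # Numerically stable log-sum-exp over the whole batch.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability assigned to the positive pair.
        total += log_denom - logits[i]
    return total / n


# Toy batch of two samples with 2-dimensional features.
text = [[1.0, 0.0], [0.0, 1.0]]
aligned_audio = [[1.0, 0.0], [0.0, 1.0]]
mismatched_audio = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce(text, aligned_audio)
loss_mismatched = info_nce(text, mismatched_audio)
```

On this toy batch the aligned pairing gives a much lower loss than the mismatched one, which is the behavior a contrastive objective is meant to reward: features of the same sample are pulled together across modalities and features of different samples are pushed apart.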
Anthology ID:
2025.coling-main.309
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
4611–4621
URL:
https://aclanthology.org/2025.coling-main.309/
Cite (ACL):
Xiangmin Zhang, Wei Wei, and Shihao Zou. 2025. Modal Feature Optimization Network with Prompt for Multimodal Sentiment Analysis. In Proceedings of the 31st International Conference on Computational Linguistics, pages 4611–4621, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Modal Feature Optimization Network with Prompt for Multimodal Sentiment Analysis (Zhang et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.309.pdf