Vision-Language Pre-Training for Multimodal Aspect-Based Sentiment Analysis

Yan Ling, Jianfei Yu, Rui Xia


Abstract
As an important task in sentiment analysis, Multimodal Aspect-Based Sentiment Analysis (MABSA) has attracted increasing attention in recent years. However, previous approaches either (i) use separately pre-trained visual and textual models, which ignore the crossmodal alignment or (ii) use vision-language models pre-trained with general pre-training tasks, which are inadequate to identify fine-grained aspects, opinions, and their alignments across modalities. To tackle these limitations, we propose a task-specific Vision-Language Pre-training framework for MABSA (VLP-MABSA), which is a unified multimodal encoder-decoder architecture for all the pretraining and downstream tasks. We further design three types of task-specific pre-training tasks from the language, vision, and multimodal modalities, respectively. Experimental results show that our approach generally outperforms the state-of-the-art approaches on three MABSA subtasks. Further analysis demonstrates the effectiveness of each pre-training task. The source code is publicly released at https://github.com/NUSTM/VLP-MABSA.
Anthology ID:
2022.acl-long.152
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
2149–2159
URL:
https://aclanthology.org/2022.acl-long.152
DOI:
10.18653/v1/2022.acl-long.152
Cite (ACL):
Yan Ling, Jianfei Yu, and Rui Xia. 2022. Vision-Language Pre-Training for Multimodal Aspect-Based Sentiment Analysis. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2149–2159, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Vision-Language Pre-Training for Multimodal Aspect-Based Sentiment Analysis (Ling et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.152.pdf
Code:
nustm/vlp-mabsa