Fast Retrieval and Slow Reasoning for Explainable Multimodal Sentiment Analysis

Aoqiang Zhu; Min Hu; Yan Xing

Abstract

Most existing Multimodal Sentiment Analysis (MSA) methods rely on holistic fusion, treating all modalities and temporal segments equally. Such strategies often introduce redundant information and obscure the decision process, limiting both robustness and interpretability. Inspired by dual-process theory, we propose FRSR (Fast Retrieval and Slow Reasoning), an interpretable framework that decomposes multimodal sentiment modeling into two cooperative pathways. The Fast Pathway acts as a lightweight evidence selector, using context-aware convolution and auxiliary supervision to retrieve a sparse set of Top-K sentiment-relevant cues from noisy multimodal inputs. Based on these cues, the Slow Pathway performs deeper cross-modal reasoning through learnable reasoning tokens, enabling hierarchical sentiment inference. By separating salient evidence retrieval from multimodal reasoning, FRSR improves interpretability while reducing computational cost. Experiments on three benchmark datasets show that FRSR achieves competitive performance, higher efficiency, stronger robustness to noise, and clearer decision transparency than existing holistic fusion methods.
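To make the Top-K retrieval idea concrete, the sketch below keeps only the k highest-scoring segments of a sequence and returns them in their original temporal order. It is an illustration only, not the authors' implementation: the abstract does not specify how relevance scores are computed, so the scores are taken as given, and the names `select_top_k` and `retrieve_evidence` are hypothetical.

```python
from typing import List, Sequence, Tuple


def select_top_k(scores: Sequence[float], k: int) -> List[int]:
    """Return indices of the k highest scores, sorted by original position.

    Assumption: `scores` are per-segment relevance scores produced by some
    upstream scorer (not specified in the abstract).
    """
    if k <= 0:
        return []
    k = min(k, len(scores))
    # Rank indices by descending score; ties keep their original order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Restore temporal order so later reasoning sees cues in sequence.
    return sorted(ranked[:k])


def retrieve_evidence(
    segments: Sequence[str], scores: Sequence[float], k: int
) -> List[Tuple[int, str]]:
    """Pair each retained segment with its original index."""
    if len(segments) != len(scores):
        raise ValueError("segments and scores must have the same length")
    return [(i, segments[i]) for i in select_top_k(scores, k)]


if __name__ == "__main__":
    # Hypothetical segments and scores, for illustration only.
    segments = ["um", "I loved it", "you know", "best film this year"]
    scores = [0.1, 0.9, 0.3, 0.8]
    print(retrieve_evidence(segments, scores, k=2))
```

A downstream reasoning stage would then operate only on the retained segments rather than on the full input, which is where the reduction in computation described in the abstract would come from.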

Anthology ID: 2026.findings-acl.1519
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 30381–30391
URL: https://aclanthology.org/2026.findings-acl.1519/
Cite (ACL): Aoqiang Zhu, Min Hu, and Yan Xing. 2026. Fast Retrieval and Slow Reasoning for Explainable Multimodal Sentiment Analysis. In Findings of the Association for Computational Linguistics: ACL 2026, pages 30381–30391, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Fast Retrieval and Slow Reasoning for Explainable Multimodal Sentiment Analysis (Zhu et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.1519.pdf
Checklist: 2026.findings-acl.1519.checklist.pdf