Resource-Limited Joint Multimodal Sentiment Reasoning and Classification via Chain-of-Thought Enhancement and Distillation

Haonan Shangguan; Xiaocui Yang; Shi Feng; Daling Wang; Yifei Zhang; Feiliang Ren; Ge Yu (于戈)

Abstract

Current approaches for Multimodal Sentiment Analysis (MSA) primarily leverage the knowledge and reasoning capabilities of parameter-heavy (Multimodal) LLMs for classification, overlooking autonomous multimodal sentiment reasoning generation in resource-constrained environments. In this paper, we focus on the Resource-Limited Joint Multimodal Sentiment Reasoning and Classification task (JMSRC), which simultaneously performs multimodal sentiment reasoning chain generation and sentiment classification using only a lightweight model. We propose a Multimodal Chain-of-Thought Reasoning Distillation model, MulCoT-RD, designed for JMSRC, which employs a "Teacher-Assistant-Student" distillation paradigm to address deployment constraints in resource-limited environments. We first leverage a high-performance Multimodal Large Language Model (MLLM) to generate the initial reasoning dataset and train a medium-sized assistant model with a multi-task learning mechanism. A lightweight student model is then jointly trained to perform efficient multimodal sentiment reasoning generation and classification. Extensive experiments on four datasets demonstrate that MulCoT-RD, with only 3B parameters, achieves strong performance on JMSRC while exhibiting robust generalization and enhanced interpretability.

Anthology ID: 2026.findings-acl.784
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 15971–15986
URL: https://aclanthology.org/2026.findings-acl.784/
Cite (ACL): Haonan Shangguan, Xiaocui Yang, Shi Feng, Daling Wang, Yifei Zhang, Feiliang Ren, and Ge Yu. 2026. Resource-Limited Joint Multimodal Sentiment Reasoning and Classification via Chain-of-Thought Enhancement and Distillation. In Findings of the Association for Computational Linguistics: ACL 2026, pages 15971–15986, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Resource-Limited Joint Multimodal Sentiment Reasoning and Classification via Chain-of-Thought Enhancement and Distillation (Shangguan et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.784.pdf
Checklist: 2026.findings-acl.784.checklist.pdf
