AICA-Bench: Holistically Examining the Capabilities of VLMs in Affective Image Content Analysis

Dong She; Xianrong Yao; Liqun Chen; Jinghe Yu; Yang Gao; Zhanpeng Jin

Abstract

Vision-Language Models (VLMs) have demonstrated strong capabilities in perception, yet holistic Affective Image Content Analysis (AICA)—which integrates perception, reasoning, and generation into a unified framework—remains underexplored. To address this, we introduce AICA-Bench, a comprehensive benchmark comprising three core tasks: Emotion Understanding (EU), Reasoning (ER), and Generation (EGCG). We evaluate 23 VLMs, revealing critical gaps: models struggle with intensity calibration and suffer from descriptive shallowness in open-ended tasks. To bridge these gaps, we propose Grounded Affective Tree (GAT) Prompting, a training-free framework that integrates visual scaffolding with hierarchical reasoning. Experiments show that GAT effectively corrects intensity errors and significantly enhances descriptive depth, establishing a robust baseline for future affective multimodal research.

Anthology ID: 2026.findings-acl.661
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 13501–13528
URL: https://aclanthology.org/2026.findings-acl.661/
Cite (ACL): Dong She, Xianrong Yao, Liqun Chen, Jinghe Yu, Yang Gao, and Zhanpeng Jin. 2026. AICA-Bench: Holistically Examining the Capabilities of VLMs in Affective Image Content Analysis. In Findings of the Association for Computational Linguistics: ACL 2026, pages 13501–13528, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): AICA-Bench: Holistically Examining the Capabilities of VLMs in Affective Image Content Analysis (She et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.661.pdf
Checklist: 2026.findings-acl.661.checklist.pdf
