When Relations Break: Analyzing Relation Hallucination in Vision-Language Model Under Rotation and Noise

Philip Wootaek Shin; Ajay Narayanan Sridhar; Lakshmi Sivani Devarapalli; Rui Zhang; Jack Sampson; Vijaykrishnan Narayanan

Abstract

Vision–language models (VLMs) achieve strong multimodal performance but remain prone to relation hallucination, a failure in tasks that require accurate reasoning over inter-object interactions. We study the impact of visual perturbations, specifically rotation and noise, and show that even mild distortions significantly degrade relational reasoning across models and datasets. We further evaluate prompt-based augmentation and preprocessing strategies (orientation correction and denoising), finding that while they offer partial improvements, they do not fully resolve hallucinations. Our results reveal a gap between perceptual robustness and relational understanding, highlighting the need for more robust, geometry-aware VLMs.

Anthology ID: 2026.alvr-main.14
Volume: Proceedings of the 4th Workshop on Advances in Language and Vision Research (ALVR)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Qianqi Yan, Syrielle Montariol, Yue Fan, Jing Gu, Jiayi Pan, Manling Li, Parisa Kordjamshidi, Alane Suhr, Xin Eric Wang
Venues: ALVR | WS
Publisher: Association for Computational Linguistics
Pages: 180–185
URL: https://aclanthology.org/2026.alvr-main.14/
Cite (ACL): Philip Wootaek Shin, Ajay Narayanan Sridhar, Lakshmi Sivani Devarapalli, Rui Zhang, Jack Sampson, and Vijaykrishnan Narayanan. 2026. When Relations Break: Analyzing Relation Hallucination in Vision-Language Model Under Rotation and Noise. In Proceedings of the 4th Workshop on Advances in Language and Vision Research (ALVR), pages 180–185, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): When Relations Break: Analyzing Relation Hallucination in Vision-Language Model Under Rotation and Noise (Shin et al., ALVR 2026)
PDF: https://aclanthology.org/2026.alvr-main.14.pdf
