Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models

Holy Lovenia; Wenliang Dai; Samuel Cahyawijaya; Ziwei Ji; Pascale Fung

doi:10.18653/v1/2024.alvr-1.4

Abstract

Object hallucination poses a significant challenge in vision-language (VL) models, often leading to the generation of nonsensical or unfaithful responses with non-existent objects. However, the absence of a general measurement for evaluating object hallucination in VL models has hindered our understanding and ability to mitigate this issue. In this work, we present NOPE (Negative Object Presence Evaluation), a novel benchmark designed to assess object hallucination in VL models through visual question answering (VQA). We propose a cost-effective and scalable approach utilizing large language models to generate 29.5k synthetic negative pronoun (NegP) data of high quality for NOPE. We extensively investigate the performance of 10 state-of-the-art VL models in discerning the non-existence of objects in visual questions, where the ground truth answers are denoted as NegP (e.g., “none”). Additionally, we evaluate their standard performance on visual questions on 9 other VQA datasets. Through our experiments, we demonstrate that no VL model is immune to the vulnerability of object hallucination, as all models achieve accuracy below 10% on NegP. Furthermore, we uncover that lexically diverse visual questions, question types with large scopes, and scene-relevant objects increase the risk of object hallucination in VL models.
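
The NegP accuracy mentioned in the abstract can be pictured with a small sketch. It is not the authors' evaluation code. It assumes exact-match scoring after light normalization, and the pronoun set `NEGATIVE_PRONOUNS` and the helper names `normalize`, `is_negp`, and `negp_accuracy` are hypothetical, chosen only for illustration.

```python
import string

# Assumed, illustrative set of negative pronouns; the benchmark's actual
# answer vocabulary is not listed on this page.
NEGATIVE_PRONOUNS = {"none", "nothing", "nobody", "no one", "nowhere", "neither"}


def normalize(answer: str) -> str:
    """Lowercase the answer and strip surrounding whitespace and punctuation."""
    return answer.lower().strip().strip(string.punctuation).strip()


def is_negp(answer: str) -> bool:
    """Return True if the normalized answer is one of the assumed negative pronouns."""
    return normalize(answer) in NEGATIVE_PRONOUNS


def negp_accuracy(predictions: list[str]) -> float:
    """Fraction of predictions that are a negative pronoun.

    Every question is assumed to have a NegP ground truth, so any answer
    naming an object counts as a hallucination (an error).
    """
    if not predictions:
        return 0.0
    return sum(is_negp(p) for p in predictions) / len(predictions)


# Hypothetical model outputs for four NegP questions.
predictions = ["None.", "a red umbrella", "Nothing", "two dogs"]
print(negp_accuracy(predictions))  # 0.5
```

Under this assumed scoring, a model that names any object for a question whose correct answer is a negative pronoun is counted as hallucinating. The paper may use a different matching rule or metric.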

Anthology ID: 2024.alvr-1.4
Volume: Proceedings of the 3rd Workshop on Advances in Language and Vision Research (ALVR)
Month: August
Year: 2024
Address: Bangkok, Thailand
Editors: Jing Gu, Tsu-Jui (Ray) Fu, Drew Hudson, Asli Celikyilmaz, William Wang
Venues: ALVR | WS
Publisher: Association for Computational Linguistics
Pages: 37–58
URL: https://aclanthology.org/2024.alvr-1.4/
DOI: 10.18653/v1/2024.alvr-1.4
Cite (ACL): Holy Lovenia, Wenliang Dai, Samuel Cahyawijaya, Ziwei Ji, and Pascale Fung. 2024. Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models. In Proceedings of the 3rd Workshop on Advances in Language and Vision Research (ALVR), pages 37–58, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal): Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models (Lovenia et al., ALVR 2024)
PDF: https://aclanthology.org/2024.alvr-1.4.pdf
