‘Just because you are right, doesn’t mean I am wrong’: Overcoming a bottleneck in development and evaluation of Open-Ended VQA tasks

Man Luo, Shailaja Keyur Sampat, Riley Tallman, Yankai Zeng, Manuha Vancha, Akarshan Sajja, Chitta Baral


Abstract
GQA (Hudson and Manning, 2019) is a dataset for real-world visual reasoning and compositional question answering. We found that many answers predicted by the best vision-language models on the GQA dataset do not match the ground-truth answer but are still semantically meaningful and correct in the given context. In fact, this is the case with most existing visual question answering (VQA) datasets, which assume only one ground-truth answer for each question. To address this limitation, we propose Alternative Answer Sets (AAS) of ground-truth answers, which are created automatically using off-the-shelf NLP tools. We introduce a semantic metric based on AAS and modify top VQA solvers to support multiple plausible answers for a question. We implement this approach on the GQA dataset and show performance improvements.
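The contrast between exact-match scoring and an AAS-based metric can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `ALTERNATIVE_SETS` mapping, and the toy answers are all hypothetical, and the alternative sets are written by hand here instead of being generated with NLP tools. Keying the sets by the ground-truth string alone is also a simplifying assumption.

```python
from typing import Dict, List, Set


def exact_match_accuracy(predictions: List[str], ground_truths: List[str]) -> float:
    """Standard scoring: a prediction counts only if it equals the single ground truth."""
    if not predictions:
        return 0.0
    hits = sum(p == g for p, g in zip(predictions, ground_truths))
    return hits / len(predictions)


def aas_accuracy(
    predictions: List[str],
    ground_truths: List[str],
    alternative_sets: Dict[str, Set[str]],
) -> float:
    """AAS-style scoring: a prediction counts if it is the ground truth
    or any member of that ground truth's alternative answer set."""
    if not predictions:
        return 0.0
    hits = 0
    for pred, truth in zip(predictions, ground_truths):
        accepted = {truth} | alternative_sets.get(truth, set())
        hits += pred in accepted
    return hits / len(predictions)


# Hypothetical, hand-written alternative answers (not taken from the paper's data).
ALTERNATIVE_SETS: Dict[str, Set[str]] = {
    "couch": {"sofa"},
    "man": {"person", "guy"},
}

predictions = ["sofa", "person", "dog"]
ground_truths = ["couch", "man", "cat"]

print(exact_match_accuracy(predictions, ground_truths))
print(aas_accuracy(predictions, ground_truths, ALTERNATIVE_SETS))
```

In this toy case, "sofa" and "person" are rejected by exact matching but accepted once the alternative sets are consulted, while "dog" is wrong under both metrics.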
Anthology ID:
2021.eacl-main.240
Volume:
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Month:
April
Year:
2021
Address:
Online
Editors:
Paola Merlo, Jorg Tiedemann, Reut Tsarfaty
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
2766–2771
URL:
https://aclanthology.org/2021.eacl-main.240
DOI:
10.18653/v1/2021.eacl-main.240
Cite (ACL):
Man Luo, Shailaja Keyur Sampat, Riley Tallman, Yankai Zeng, Manuha Vancha, Akarshan Sajja, and Chitta Baral. 2021. ‘Just because you are right, doesn’t mean I am wrong’: Overcoming a bottleneck in development and evaluation of Open-Ended VQA tasks. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 2766–2771, Online. Association for Computational Linguistics.
Cite (Informal):
‘Just because you are right, doesn’t mean I am wrong’: Overcoming a bottleneck in development and evaluation of Open-Ended VQA tasks (Luo et al., EACL 2021)
PDF:
https://aclanthology.org/2021.eacl-main.240.pdf
Data:
GQA, Visual Question Answering