PropTest: Automatic Property Testing for Improved Visual Programming

Jaywon Koo; Ziyan Yang; Paola Cascante-Bonilla; Baishakhi Ray; Vicente Ordonez

Abstract

Visual Programming has recently emerged as an alternative to end-to-end black-box visual reasoning models. Methods of this type leverage Large Language Models (LLMs) to generate the source code for an executable computer program that solves a given problem. This strategy offers an interpretable reasoning path and does not require finetuning a model with task-specific data. We propose PropTest, a general strategy that improves visual programming by further using an LLM to generate code that tests for visual properties in an initial round of proposed solutions. Our method generates tests for data-type consistency, output syntax, and semantic properties. PropTest achieves results comparable to state-of-the-art methods while using publicly available LLMs. This is demonstrated across different benchmarks on visual question answering and referring expression comprehension. In particular, PropTest improves on ViperGPT, obtaining 46.1% accuracy (+6.0%) on GQA using Llama3-8B and 59.5% (+8.1%) on RefCOCO+ using CodeLlama-34B.
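To make the three kinds of checks named in the abstract concrete, the sketch below shows what a generated property test might look like for a question such as "What color is the car?". It is an illustration under stated assumptions, not the authors' implementation: the function names, the `KNOWN_COLORS` vocabulary, and the rule for choosing among candidate outputs are all hypothetical.

```python
from typing import Any, Callable

# Hypothetical vocabulary for the semantic check; not taken from the paper.
KNOWN_COLORS = {"red", "green", "blue", "white", "black", "yellow", "gray", "brown"}


def property_test_color_answer(result: Any) -> None:
    """Hypothetical test an LLM might generate for 'What color is the car?'."""
    # Data-type consistency: the answer should be a string.
    assert isinstance(result, str), "answer must be a string"
    # Output syntax: the answer should be short (at most two words).
    assert 1 <= len(result.split()) <= 2, "answer must be one or two words"
    # Semantic property: the answer should name a color.
    assert result.lower() in KNOWN_COLORS, "answer must be a color"


def passes(test: Callable[[Any], None], result: Any) -> bool:
    """Return True if the result satisfies every assertion in the test."""
    try:
        test(result)
        return True
    except AssertionError:
        return False


def first_passing(candidates: list[Any], test: Callable[[Any], None]) -> Any:
    """Assumed selection rule: first passing output, else the first candidate."""
    for result in candidates:
        if passes(test, result):
            return result
    return candidates[0]
```

Here `first_passing` stands in for whatever the pipeline does when a proposed program's output fails its test; the abstract does not specify that step, so this selection rule is only one plausible choice.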

Anthology ID: 2024.findings-emnlp.483
Volume: Findings of the Association for Computational Linguistics: EMNLP 2024
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 8241–8256
URL: https://aclanthology.org/2024.findings-emnlp.483
Cite (ACL): Jaywon Koo, Ziyan Yang, Paola Cascante-Bonilla, Baishakhi Ray, and Vicente Ordonez. 2024. PropTest: Automatic Property Testing for Improved Visual Programming. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 8241–8256, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): PropTest: Automatic Property Testing for Improved Visual Programming (Koo et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-emnlp.483.pdf
Software: 2024.findings-emnlp.483.software.zip
