Manoj Acharya


2019

VQD: Visual Query Detection In Natural Scenes
Manoj Acharya | Karan Jariwala | Christopher Kanan
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

We propose a new visual grounding task called Visual Query Detection (VQD). In VQD, the task is to localize a variable number of objects in an image where the objects are specified in natural language. VQD is related to visual referring expression comprehension, where the task is to localize only one object. We propose the first algorithms for VQD, and we evaluate them on both visual referring expression datasets and our new VQDv1 dataset.