Interactive Symbol Grounding with Complex Referential Expressions
Rimvydas Rubavicius | Alex Lascarides
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
We present a procedure for learning to ground symbols from a sequence of stimuli consisting of an arbitrarily complex noun phrase (e.g. “all but one green square above both red circles.”) and its designation in the visual scene. Our distinctive approach combines: a) lazy few-shot learning to relate open-class words like green and above to their visual percepts; and b) symbolic reasoning with closed-class word categories like quantifiers and negation. We use this combination to estimate new training examples for grounding symbols that occur within a noun phrase but aren’t designated by that noun phase (e.g, red in the above example), thereby potentially gaining data efficiency. We evaluate the approach in a visual reference resolution task, in which the learner starts out unaware of concepts that are part of the domain model and how they relate to visual percepts.