Guido Linders
2018
Grounded Textual Entailment
Hoa Trong Vu
|
Claudio Greco
|
Aliia Erofeeva
|
Somayeh Jafaritazehjan
|
Guido Linders
|
Marc Tanti
|
Alberto Testoni
|
Raffaella Bernardi
|
Albert Gatt
Proceedings of the 27th International Conference on Computational Linguistics
Capturing semantic relations between sentences, such as entailment, is a long-standing challenge for computational semantics. Logic-based models analyse entailment in terms of possible worlds (interpretations, or situations) where a premise P entails a hypothesis H iff in all worlds where P is true, H is also true. Statistical models view this relationship probabilistically, addressing it in terms of whether a human would likely infer H from P. In this paper, we wish to bridge these two perspectives, by arguing for a visually-grounded version of the Textual Entailment task. Specifically, we ask whether models can perform better if, in addition to P and H, there is also an image (corresponding to the relevant “world” or “situation”). We use a multimodal version of the SNLI dataset (Bowman et al., 2015) and we compare “blind” and visually-augmented models of textual entailment. We show that visual information is beneficial, but we also conduct an in-depth error analysis that reveals that current multimodal models are not performing “grounding” in an optimal fashion.
Search
Fix data
Co-authors
- Raffaella Bernardi 1
- Aliia Erofeeva 1
- Albert Gatt 1
- Claudio Greco 1
- Somayeh Jafaritazehjani 1
- show all...