Pragmatic Issue-Sensitive Image Captioning

Allen Nie, Reuben Cohn-Gordon, Christopher Potts


Abstract
Image captioning systems need to produce texts that are not only true but also relevant in that they are properly aligned with the current issues. For instance, in a newspaper article about a sports event, a caption that not only identifies the player in a picture but also comments on their ethnicity could create unwanted reader reactions. To address this, we propose Issue-Sensitive Image Captioning (ISIC). In ISIC, the captioner is given a target image and an issue, which is a set of images partitioned in a way that specifies what information is relevant. For the sports article, we could construct a partition that places images into equivalence classes based on player position. To model this task, we use an extension of the Rational Speech Acts model. Our extension is built on top of state-of-the-art pretrained neural image captioners and explicitly uses image partitions to control caption generation. In both automatic and human evaluations, we show that these models generate captions that are descriptive and issue-sensitive. Finally, we show how ISIC can complement and enrich the related task of Visual Question Answering.
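To make the partition idea concrete, here is a minimal toy sketch of an issue-sensitive speaker in the spirit of the Rational Speech Acts model. It is an illustration under stated assumptions, not the authors' implementation. The image names, captions, and probabilities are invented. The literal captioner `s0` is a hand-written table standing in for a pretrained neural captioner. The listener prior is assumed uniform, and utterance cost is omitted. The names `literal_listener`, `issue_sensitive_speaker`, `cell_of`, and `alpha` are hypothetical.

```python
def normalize(weights):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {key: value / total for key, value in weights.items()}


def literal_listener(s0, caption):
    """L0(image | caption), proportional to S0(caption | image) under a uniform image prior."""
    return normalize({image: dist[caption] for image, dist in s0.items()})


def issue_sensitive_speaker(s0, target, cell_of, alpha=1.0):
    """Reweight the literal captions for `target` by how much listener
    probability each caption places on the target's cell of the partition."""
    scores = {}
    for caption, base_prob in s0[target].items():
        l0 = literal_listener(s0, caption)
        # Total listener mass on images in the same equivalence class as the target.
        cell_mass = sum(p for image, p in l0.items() if cell_of[image] == cell_of[target])
        scores[caption] = base_prob * cell_mass ** alpha
    return normalize(scores)


# Toy literal captioner S0(caption | image); all values are invented for illustration.
S0 = {
    "red_bird_on_branch":  {"a red bird": 0.4, "a bird on a branch": 0.4, "a bird": 0.2},
    "red_bird_in_water":   {"a red bird": 0.7, "a bird on a branch": 0.1, "a bird": 0.2},
    "blue_bird_on_branch": {"a red bird": 0.1, "a bird on a branch": 0.7, "a bird": 0.2},
}

# Two issues over the same images: one partitions by color, the other by location.
BY_COLOR = {"red_bird_on_branch": "red", "red_bird_in_water": "red", "blue_bird_on_branch": "blue"}
BY_LOCATION = {"red_bird_on_branch": "branch", "red_bird_in_water": "water", "blue_bird_on_branch": "branch"}

color_dist = issue_sensitive_speaker(S0, "red_bird_on_branch", BY_COLOR)
location_dist = issue_sensitive_speaker(S0, "red_bird_on_branch", BY_LOCATION)

print(max(color_dist, key=color_dist.get))
print(max(location_dist, key=location_dist.get))
```

In this toy setting the same target image gets a different preferred caption depending on the partition: the color issue favors the caption that separates red birds from blue ones, and the location issue favors the caption that separates branch scenes from water scenes.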
Anthology ID:
2020.findings-emnlp.173
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2020
Month:
November
Year:
2020
Address:
Online
Editors:
Trevor Cohn, Yulan He, Yang Liu
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1924–1938
URL:
https://aclanthology.org/2020.findings-emnlp.173
DOI:
10.18653/v1/2020.findings-emnlp.173
Cite (ACL):
Allen Nie, Reuben Cohn-Gordon, and Christopher Potts. 2020. Pragmatic Issue-Sensitive Image Captioning. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1924–1938, Online. Association for Computational Linguistics.
Cite (Informal):
Pragmatic Issue-Sensitive Image Captioning (Nie et al., Findings 2020)
PDF:
https://aclanthology.org/2020.findings-emnlp.173.pdf
Code:
windweller/Pragmatic-ISIC
Data:
MS COCO, Visual Question Answering, Visual Question Answering v2.0