Multimodal Persona Based Generation of Comic Dialogs

Harsh Agrawal, Aditya Mishra, Manish Gupta, Mausam


Abstract
We focus on the novel problem of persona based dialogue generation for comic strips. Dialogs in comic strips are a unique and unexplored area: every strip contains utterances from various characters, each building upon the previous utterances and the associated visual scene. Previous works like DialoGPT, PersonaGPT and other dialog generation models encode two-party dialogues and do not account for visual information. To the best of our knowledge, we are the first to propose the paradigm of multimodal persona based dialogue generation. We contribute a novel dataset, ComSet, consisting of 54K strips harvested from 13 popular comics available online. Further, we propose a multimodal persona-based architecture, MPDialog, to generate dialogues for the next panel in the strip; it decreases the perplexity score by ~10 points over strong dialogue generation baseline models. We demonstrate that there is still ample opportunity for improvement, highlighting the importance of building stronger dialogue systems that can generate persona-consistent dialogues and understand the context through various modalities.
Anthology ID:
2023.acl-long.791
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
14150–14164
URL:
https://aclanthology.org/2023.acl-long.791
DOI:
10.18653/v1/2023.acl-long.791
Cite (ACL):
Harsh Agrawal, Aditya Mishra, Manish Gupta, and Mausam. 2023. Multimodal Persona Based Generation of Comic Dialogs. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 14150–14164, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Multimodal Persona Based Generation of Comic Dialogs (Agrawal et al., ACL 2023)
PDF:
https://aclanthology.org/2023.acl-long.791.pdf