Divide and Conquer: Rethinking Ambiguous Candidate Identification in Multimodal Dialogues with Pseudo-Labelling

Bhathiya Hemanthage, Christian Dondrup, Hakan Bilen, Oliver Lemon


Abstract
Ambiguous Candidate Identification(ACI) in multimodal dialogue is the task of identifying all potential objects that a user’s utterance could be referring to in a visual scene, in cases where the reference cannot be uniquely determined. End-to-end models are the dominant approach for this task, but have limited real-world applicability due to unrealistic inference-time assumptions such as requiring predefined catalogues of items. Focusing on a more generalized and realistic ACI setup, we demonstrate that a modular approach, which first emphasizes language-only reasoning over dialogue context before performing vision-language fusion, significantly outperforms end-to-end trained baselines. To mitigate the lack of annotations for training the language-only module (student), we propose a pseudo-labelling strategy with a prompted Large Language Model (LLM) as the teacher.
Anthology ID:
2024.sigdial-1.20
Volume:
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Month:
September
Year:
2024
Address:
Kyoto, Japan
Editors:
Tatsuya Kawahara, Vera Demberg, Stefan Ultes, Koji Inoue, Shikib Mehri, David Howcroft, Kazunori Komatani
Venue:
SIGDIAL
SIG:
SIGDIAL
Publisher:
Association for Computational Linguistics
Note:
Pages:
222–227
Language:
URL:
https://aclanthology.org/2024.sigdial-1.20
DOI:
Bibkey:
Cite (ACL):
Bhathiya Hemanthage, Christian Dondrup, Hakan Bilen, and Oliver Lemon. 2024. Divide and Conquer: Rethinking Ambiguous Candidate Identification in Multimodal Dialogues with Pseudo-Labelling. In Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 222–227, Kyoto, Japan. Association for Computational Linguistics.
Cite (Informal):
Divide and Conquer: Rethinking Ambiguous Candidate Identification in Multimodal Dialogues with Pseudo-Labelling (Hemanthage et al., SIGDIAL 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.sigdial-1.20.pdf