Divide and Conquer: Rethinking Ambiguous Candidate Identification in Multimodal Dialogues with Pseudo-Labelling

Bhathiya Hemanthage; Christian Dondrup; Hakan Bilen; Oliver Lemon

doi:10.18653/v1/2024.sigdial-1.20

Abstract

Ambiguous Candidate Identification (ACI) in multimodal dialogue is the task of identifying all potential objects that a user’s utterance could be referring to in a visual scene, in cases where the reference cannot be uniquely determined. End-to-end models are the dominant approach for this task, but have limited real-world applicability due to unrealistic inference-time assumptions such as requiring predefined catalogues of items. Focusing on a more generalized and realistic ACI setup, we demonstrate that a modular approach, which first emphasizes language-only reasoning over dialogue context before performing vision-language fusion, significantly outperforms end-to-end trained baselines. To mitigate the lack of annotations for training the language-only module (student), we propose a pseudo-labelling strategy with a prompted Large Language Model (LLM) as the teacher.

Anthology ID: 2024.sigdial-1.20
Volume: Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Month: September
Year: 2024
Address: Kyoto, Japan
Editors: Tatsuya Kawahara, Vera Demberg, Stefan Ultes, Koji Inoue, Shikib Mehri, David Howcroft, Kazunori Komatani
Venue: SIGDIAL
SIG: SIGDIAL
Publisher: Association for Computational Linguistics
Pages: 222–227
URL: https://aclanthology.org/2024.sigdial-1.20/
DOI: 10.18653/v1/2024.sigdial-1.20
Cite (ACL): Bhathiya Hemanthage, Christian Dondrup, Hakan Bilen, and Oliver Lemon. 2024. Divide and Conquer: Rethinking Ambiguous Candidate Identification in Multimodal Dialogues with Pseudo-Labelling. In Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 222–227, Kyoto, Japan. Association for Computational Linguistics.
Cite (Informal): Divide and Conquer: Rethinking Ambiguous Candidate Identification in Multimodal Dialogues with Pseudo-Labelling (Hemanthage et al., SIGDIAL 2024)
PDF: https://aclanthology.org/2024.sigdial-1.20.pdf