@inproceedings{long-etal-2023-improving,
title = "Improving Situated Conversational Agents with Step-by-Step Multi-modal Logic Reasoning",
author = "Long, Yuxing and
Zhang, Huibin and
Hui, Binyuan and
Yang, Zhenglu and
Yuan, Caixia and
Wang, Xiaojie and
Huang, Fei and
Li, Yongbin",
editor = "Chen, Yun-Nung and
Crook, Paul and
Galley, Michel and
Ghazarian, Sarik and
Gunasekara, Chulaka and
Gupta, Raghav and
Hedayatnia, Behnam and
Kottur, Satwik and
Moon, Seungwhan and
Zhang, Chen",
booktitle = "Proceedings of The Eleventh Dialog System Technology Challenge",
month = sep,
year = "2023",
address = "Prague, Czech Republic",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.dstc-1.3",
pages = "15--24",
    abstract = "To fulfill complex user requirements in a situated conversational scenario, the agent needs to conduct step-by-step multi-modal logic reasoning, which includes locating objects, querying information, and searching for objects. However, existing methods omit this multi-step procedure and therefore risk taking shortcuts when making predictions. For example, they may directly copy information from the dialogue history or simply use the textual description without performing visual reasoning. To address this issue and further boost system performance, we apply dual process theory to plug a reasoner into the original transformer-based model for step-by-step reasoning. When System 2 completes multi-step reasoning, its output is regarded as the final prediction. Our proposed method achieved 1st rank on the summed scores across all four DSTC-11 SIMMC 2.1 sub-tasks.",
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="long-etal-2023-improving">
<titleInfo>
<title>Improving Situated Conversational Agents with Step-by-Step Multi-modal Logic Reasoning</title>
</titleInfo>
<name type="personal">
<namePart type="given">Yuxing</namePart>
<namePart type="family">Long</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Huibin</namePart>
<namePart type="family">Zhang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Binyuan</namePart>
<namePart type="family">Hui</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Zhenglu</namePart>
<namePart type="family">Yang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Caixia</namePart>
<namePart type="family">Yuan</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Xiaojie</namePart>
<namePart type="family">Wang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Fei</namePart>
<namePart type="family">Huang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Yongbin</namePart>
<namePart type="family">Li</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2023-09</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of The Eleventh Dialog System Technology Challenge</title>
</titleInfo>
<name type="personal">
<namePart type="given">Yun-Nung</namePart>
<namePart type="family">Chen</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Paul</namePart>
<namePart type="family">Crook</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Michel</namePart>
<namePart type="family">Galley</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sarik</namePart>
<namePart type="family">Ghazarian</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Chulaka</namePart>
<namePart type="family">Gunasekara</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Raghav</namePart>
<namePart type="family">Gupta</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Behnam</namePart>
<namePart type="family">Hedayatnia</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Satwik</namePart>
<namePart type="family">Kottur</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Seungwhan</namePart>
<namePart type="family">Moon</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Chen</namePart>
<namePart type="family">Zhang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Prague, Czech Republic</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>To fulfill complex user requirements in a situated conversational scenario, the agent needs to conduct step-by-step multi-modal logic reasoning, which includes locating objects, querying information, and searching for objects. However, existing methods omit this multi-step procedure and therefore risk taking shortcuts when making predictions. For example, they may directly copy information from the dialogue history or simply use the textual description without performing visual reasoning. To address this issue and further boost system performance, we apply dual process theory to plug a reasoner into the original transformer-based model for step-by-step reasoning. When System 2 completes multi-step reasoning, its output is regarded as the final prediction. Our proposed method achieved 1st rank on the summed scores across all four DSTC-11 SIMMC 2.1 sub-tasks.</abstract>
<identifier type="citekey">long-etal-2023-improving</identifier>
<location>
<url>https://aclanthology.org/2023.dstc-1.3</url>
</location>
<part>
<date>2023-09</date>
<extent unit="page">
<start>15</start>
<end>24</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Improving Situated Conversational Agents with Step-by-Step Multi-modal Logic Reasoning
%A Long, Yuxing
%A Zhang, Huibin
%A Hui, Binyuan
%A Yang, Zhenglu
%A Yuan, Caixia
%A Wang, Xiaojie
%A Huang, Fei
%A Li, Yongbin
%Y Chen, Yun-Nung
%Y Crook, Paul
%Y Galley, Michel
%Y Ghazarian, Sarik
%Y Gunasekara, Chulaka
%Y Gupta, Raghav
%Y Hedayatnia, Behnam
%Y Kottur, Satwik
%Y Moon, Seungwhan
%Y Zhang, Chen
%S Proceedings of The Eleventh Dialog System Technology Challenge
%D 2023
%8 September
%I Association for Computational Linguistics
%C Prague, Czech Republic
%F long-etal-2023-improving
%X To fulfill complex user requirements in a situated conversational scenario, the agent needs to conduct step-by-step multi-modal logic reasoning, which includes locating objects, querying information, and searching for objects. However, existing methods omit this multi-step procedure and therefore risk taking shortcuts when making predictions. For example, they may directly copy information from the dialogue history or simply use the textual description without performing visual reasoning. To address this issue and further boost system performance, we apply dual process theory to plug a reasoner into the original transformer-based model for step-by-step reasoning. When System 2 completes multi-step reasoning, its output is regarded as the final prediction. Our proposed method achieved 1st rank on the summed scores across all four DSTC-11 SIMMC 2.1 sub-tasks.
%U https://aclanthology.org/2023.dstc-1.3
%P 15-24
Markdown (Informal)
[Improving Situated Conversational Agents with Step-by-Step Multi-modal Logic Reasoning](https://aclanthology.org/2023.dstc-1.3) (Long et al., DSTC-WS 2023)
ACL
Yuxing Long, Huibin Zhang, Binyuan Hui, Zhenglu Yang, Caixia Yuan, Xiaojie Wang, Fei Huang, and Yongbin Li. 2023. Improving Situated Conversational Agents with Step-by-Step Multi-modal Logic Reasoning. In Proceedings of The Eleventh Dialog System Technology Challenge, pages 15–24, Prague, Czech Republic. Association for Computational Linguistics.