Overview of Situated and Interactive Multimodal Conversations (SIMMC) 2.1 Track at DSTC 11

Satwik Kottur, Seungwhan Moon


Abstract
With ever-increasing interest in task-oriented dialog systems, the recent work on Situated and Interactive Multimodal Conversations (SIMMC 2.0) aims to develop personal assistants that interact with users, grounded in an immersive and co-observed setting of photo-realistic scenes. The dataset contains 11k task-oriented dialogs set in an interactive shopping scenario, spanning more than 117k utterances. To push research towards this next generation of virtual assistants, the SIMMC 2.1 challenge was conducted at the Eleventh Dialog System Technology Challenge (DSTC), with entries from across the world competing to achieve state-of-the-art performance on the SIMMC 2.1 task. In this report, we present and compare 13 SIMMC 2.1 model entries from 5 teams across the world to understand the progress made over the last three years (starting with the SIMMC 1.0 and 2.0 challenges) for multimodal task-oriented dialog systems. We hope that our analysis sheds light on components that showed promise, in addition to identifying the gaps for future research towards this grand goal of an immersive multimodal conversational agent.
Anthology ID:
2023.dstc-1.26
Volume:
Proceedings of The Eleventh Dialog System Technology Challenge
Month:
September
Year:
2023
Address:
Prague, Czech Republic
Editors:
Yun-Nung Chen, Paul Crook, Michel Galley, Sarik Ghazarian, Chulaka Gunasekara, Raghav Gupta, Behnam Hedayatnia, Satwik Kottur, Seungwhan Moon, Chen Zhang
Venues:
DSTC | WS
Publisher:
Association for Computational Linguistics
Pages:
235–241
URL:
https://aclanthology.org/2023.dstc-1.26
Cite (ACL):
Satwik Kottur and Seungwhan Moon. 2023. Overview of Situated and Interactive Multimodal Conversations (SIMMC) 2.1 Track at DSTC 11. In Proceedings of The Eleventh Dialog System Technology Challenge, pages 235–241, Prague, Czech Republic. Association for Computational Linguistics.
Cite (Informal):
Overview of Situated and Interactive Multimodal Conversations (SIMMC) 2.1 Track at DSTC 11 (Kottur & Moon, DSTC-WS 2023)
PDF:
https://aclanthology.org/2023.dstc-1.26.pdf