MODDP: A Multi-modal Open-domain Chinese Dataset for Dialogue Discourse Parsing

Chen Gong, DeXin Kong, Suxian Zhao, Xingyu Li, Guohong Fu


Abstract
Dialogue discourse parsing (DDP) aims to capture the relations between utterances in a dialogue. In everyday real-world scenarios, dialogues are typically multi-modal and cover open-domain topics. However, most widely used benchmark datasets for DDP contain only the textual modality and are domain-specific. This makes it difficult to understand a dialogue accurately and comprehensively without multi-modal clues, and prevents these datasets from capturing the discourse structures of the more prevalent daily conversations. This paper proposes MODDP, the first multi-modal Chinese discourse parsing dataset derived from open-domain daily dialogues, consisting of 864 dialogues and 18,114 utterances, accompanied by 12.7 hours of video clips. We present a simple yet effective benchmark approach for multi-modal DDP and, through extensive experiments, report several benchmark results on MODDP. The significant performance improvement from introducing multiple modalities into the original text-only DDP model demonstrates the necessity of integrating multi-modal information into DDP.
Anthology ID:
2024.findings-acl.628
Volume:
Findings of the Association for Computational Linguistics ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand and virtual meeting
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
10561–10573
URL:
https://aclanthology.org/2024.findings-acl.628
DOI:
Bibkey:
Cite (ACL):
Chen Gong, DeXin Kong, Suxian Zhao, Xingyu Li, and Guohong Fu. 2024. MODDP: A Multi-modal Open-domain Chinese Dataset for Dialogue Discourse Parsing. In Findings of the Association for Computational Linguistics ACL 2024, pages 10561–10573, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
MODDP: A Multi-modal Open-domain Chinese Dataset for Dialogue Discourse Parsing (Gong et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.628.pdf