MultiWOZ 2.4: A Multi-Domain Task-Oriented Dialogue Dataset with Essential Annotation Corrections to Improve State Tracking Evaluation

Fanghua Ye; Jarana Manotumruksa; Emine Yilmaz

doi:10.18653/v1/2022.sigdial-1.34

Abstract

The MultiWOZ 2.0 dataset has greatly stimulated research on task-oriented dialogue systems. However, its state annotations contain substantial noise, which hinders proper evaluation of model performance. To address this issue, considerable effort has been devoted to correcting the annotations, and three improved versions (MultiWOZ 2.1-2.3) have since been released. Nonetheless, plenty of incorrect and inconsistent annotations remain. This work introduces MultiWOZ 2.4, which refines the annotations in the validation and test sets of MultiWOZ 2.1. The annotations in the training set remain unchanged (the same as in MultiWOZ 2.1) to elicit robust and noise-resilient model training. We benchmark eight state-of-the-art dialogue state tracking models on MultiWOZ 2.4. All of them demonstrate much higher performance than on MultiWOZ 2.1.

Anthology ID: 2022.sigdial-1.34
Volume: Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue
Month: September
Year: 2022
Address: Edinburgh, UK
Editors: Oliver Lemon, Dilek Hakkani-Tur, Junyi Jessy Li, Arash Ashrafzadeh, Daniel Hernández Garcia, Malihe Alikhani, David Vandyke, Ondřej Dušek
Venue: SIGDIAL
SIG: SIGDIAL
Publisher: Association for Computational Linguistics
Pages: 351–360
URL: https://aclanthology.org/2022.sigdial-1.34/
DOI: 10.18653/v1/2022.sigdial-1.34
Cite (ACL): Fanghua Ye, Jarana Manotumruksa, and Emine Yilmaz. 2022. MultiWOZ 2.4: A Multi-Domain Task-Oriented Dialogue Dataset with Essential Annotation Corrections to Improve State Tracking Evaluation. In Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 351–360, Edinburgh, UK. Association for Computational Linguistics.
Cite (Informal): MultiWOZ 2.4: A Multi-Domain Task-Oriented Dialogue Dataset with Essential Annotation Corrections to Improve State Tracking Evaluation (Ye et al., SIGDIAL 2022)
PDF: https://aclanthology.org/2022.sigdial-1.34.pdf
Video: https://youtu.be/mI5UNXEtSTI
