Self-Contained Utterance Description Corpus for Japanese Dialog

Yuta Hayashibe


Abstract
Often both an utterance and its context must be read to understand its intent in a dialog. Herein we propose a task, Self-Contained Utterance Description (SCUD), to describe the intent of an utterance in a dialog with multiple simple natural sentences without the context. If this task can be performed concurrently with high accuracy as the conversation continues, such as in an accommodation search dialog, the operator can easily suggest candidates to the customer by inputting SCUDs of the customer’s utterances to the accommodation search system. SCUDs can also describe the transition of customer requests from the dialog log. We construct a Japanese corpus to train and evaluate automatic SCUD generation. The corpus consists of 210 dialogs containing 10,814 sentences. We conduct an experiment to verify that SCUDs can be automatically generated. Additionally, we investigate the influence of the amount of training data on the automatic generation performance using 8,200 additional examples.
Anthology ID:
2022.lrec-1.133
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
1249–1255
URL:
https://aclanthology.org/2022.lrec-1.133
Cite (ACL):
Yuta Hayashibe. 2022. Self-Contained Utterance Description Corpus for Japanese Dialog. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 1249–1255, Marseille, France. European Language Resources Association.
Cite (Informal):
Self-Contained Utterance Description Corpus for Japanese Dialog (Hayashibe, LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.133.pdf