Creating a Data Set of Abstractive Summaries of Turn-labeled Spoken Human-Computer Conversations

Iris Hendrickx


Abstract
Digitally recorded written and spoken dialogues are becoming increasingly available as a result of technological advances such as online messenger services and the use of chatbots. Summaries are a natural way of presenting the important information gathered from dialogues. We present a unique data set that consists of Dutch spoken human-computer conversations, an annotation layer of turn labels, and conversational abstractive summaries of user answers. The data set is publicly available for research purposes.
Anthology ID:
2022.lrec-1.240
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
2236–2244
URL:
https://aclanthology.org/2022.lrec-1.240
Cite (ACL):
Iris Hendrickx. 2022. Creating a Data Set of Abstractive Summaries of Turn-labeled Spoken Human-Computer Conversations. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 2236–2244, Marseille, France. European Language Resources Association.
Cite (Informal):
Creating a Data Set of Abstractive Summaries of Turn-labeled Spoken Human-Computer Conversations (Hendrickx, LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.240.pdf