Storytelling with Dialogue: A Critical Role Dungeons and Dragons Dataset

Revanth Rameshkumar, Peter Bailey


Abstract
This paper describes the Critical Role Dungeons and Dragons Dataset (CRD3) and related analyses. Critical Role is an unscripted, live-streamed show where a fixed group of people play Dungeons and Dragons, an open-ended role-playing game. The dataset is collected from 159 Critical Role episodes transcribed to text dialogues, consisting of 398,682 turns. It also includes corresponding abstractive summaries collected from the Fandom wiki. The dataset is linguistically unique in that the narratives are generated entirely through player collaboration and spoken interaction. For each dialogue, there are a large number of turns, multiple abstractive summaries with varying levels of detail, and semantic ties to the previous dialogues. In addition, we provide a data augmentation method that produces 34,243 summary-dialogue chunk pairs to support current neural ML approaches, and we provide an abstractive summarization benchmark and evaluation.
Anthology ID:
2020.acl-main.459
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
5121–5134
URL:
https://aclanthology.org/2020.acl-main.459
DOI:
10.18653/v1/2020.acl-main.459
Cite (ACL):
Revanth Rameshkumar and Peter Bailey. 2020. Storytelling with Dialogue: A Critical Role Dungeons and Dragons Dataset. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5121–5134, Online. Association for Computational Linguistics.
Cite (Informal):
Storytelling with Dialogue: A Critical Role Dungeons and Dragons Dataset (Rameshkumar & Bailey, ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.459.pdf
Software:
2020.acl-main.459.Software.zip
Video:
http://slideslive.com/38928758
Data:
CRD3