NarraSum: A Large-Scale Dataset for Abstractive Narrative Summarization

Chao Zhao, Faeze Brahman, Kaiqiang Song, Wenlin Yao, Dian Yu, Snigdha Chaturvedi


Abstract
Narrative summarization aims to produce a distilled version of a narrative to describe its most salient events and characters. Writing a summary for a narrative is challenging as it requires an understanding of event causality and character behaviors. To encourage research in this direction, we propose NarraSum, a large-scale narrative summarization dataset. It contains 122K narratives, which are collected from the synopses of movies and TV episodes with diverse genres, and their corresponding abstractive summaries. Experiments show that there is a large performance gap between humans and the state-of-the-art summarization models on NarraSum. We hope that this dataset will promote future research in summarization, as well as broader studies of natural language understanding and generation. The dataset is available at https://github.com/zhaochaocs/narrasum.
Anthology ID:
2022.findings-emnlp.14
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2022
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
182–197
URL:
https://aclanthology.org/2022.findings-emnlp.14
DOI:
10.18653/v1/2022.findings-emnlp.14
Cite (ACL):
Chao Zhao, Faeze Brahman, Kaiqiang Song, Wenlin Yao, Dian Yu, and Snigdha Chaturvedi. 2022. NarraSum: A Large-Scale Dataset for Abstractive Narrative Summarization. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 182–197, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
NarraSum: A Large-Scale Dataset for Abstractive Narrative Summarization (Zhao et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-emnlp.14.pdf