Exploring Content Selection in Summarization of Novel Chapters

Faisal Ladhak, Bryan Li, Yaser Al-Onaizan, Kathleen McKeown


Abstract
We present a new summarization task, generating summaries of novel chapters using summary/chapter pairs from online study guides. This is a harder task than the news summarization task, given the chapter length as well as the extreme paraphrasing and generalization found in the summaries. We focus on extractive summarization, which requires the creation of a gold-standard set of extractive summaries. We present a new metric for aligning reference summary sentences with chapter sentences to create gold extracts and also experiment with different alignment methods. Our experiments demonstrate significant improvement over prior alignment approaches for our task as shown through automatic metrics and a crowd-sourced pyramid analysis.
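To make the idea of "aligning reference summary sentences with chapter sentences to create gold extracts" concrete, here is a minimal sketch. It is not the alignment metric proposed in the paper: the similarity used here is a plain unigram-overlap F1, chosen only as a stand-in, and the function names (`tokens`, `overlap_f1`, `align`, `gold_extract`) are hypothetical.

```python
import re
from collections import Counter


def tokens(sentence):
    """Lowercase word tokens; a deliberately simple tokenizer (assumption)."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def overlap_f1(summary_sent, chapter_sent):
    """Unigram F1 between two sentences (stand-in similarity, not the paper's metric)."""
    cs, cc = Counter(tokens(summary_sent)), Counter(tokens(chapter_sent))
    common = sum((cs & cc).values())  # multiset intersection size
    if common == 0:
        return 0.0
    precision = common / sum(cc.values())
    recall = common / sum(cs.values())
    return 2 * precision * recall / (precision + recall)


def align(summary_sents, chapter_sents):
    """For each summary sentence, return the index of its best-scoring chapter sentence."""
    alignment = []
    for s in summary_sents:
        scores = [overlap_f1(s, c) for c in chapter_sents]
        # On ties, max() keeps the earliest chapter sentence.
        best = max(range(len(chapter_sents)), key=scores.__getitem__)
        alignment.append(best)
    return alignment


def gold_extract(summary_sents, chapter_sents):
    """Aligned chapter sentences, in chapter order, without duplicates."""
    picked = sorted(set(align(summary_sents, chapter_sents)))
    return [chapter_sents[i] for i in picked]
```

Calling `gold_extract(summary_sents, chapter_sents)` on a sentence-split summary and chapter yields a candidate extractive reference. Any other sentence-level similarity could replace `overlap_f1` to compare alignment methods.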
Anthology ID:
2020.acl-main.453
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
5043–5054
URL:
https://aclanthology.org/2020.acl-main.453
DOI:
10.18653/v1/2020.acl-main.453
Cite (ACL):
Faisal Ladhak, Bryan Li, Yaser Al-Onaizan, and Kathleen McKeown. 2020. Exploring Content Selection in Summarization of Novel Chapters. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5043–5054, Online. Association for Computational Linguistics.
Cite (Informal):
Exploring Content Selection in Summarization of Novel Chapters (Ladhak et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.453.pdf
Video:
http://slideslive.com/38929346
Code:
manestay/novel-chapter-dataset