Summary Cloze: A New Task for Content Selection in Topic-Focused Summarization

Daniel Deutsch, Dan Roth


Abstract
A key challenge in topic-focused summarization is determining what information should be included in the summary, a problem known as content selection. In this work, we propose a new method for studying content selection in topic-focused summarization called the summary cloze task. The goal of the summary cloze task is to generate the next sentence of a summary conditioned on the beginning of the summary, a topic, and one or more reference documents. The main challenge is deciding what information in the references is relevant to the topic and partial summary and should be included in the summary. Although the cloze task does not address all aspects of the traditional summarization problem, the narrower scope of the task allows us to collect a large-scale dataset of nearly 500k summary cloze instances from Wikipedia. We report experimental results on this new dataset using various extractive models and a two-step abstractive model that first extractively selects a small number of sentences and then abstractively summarizes them. Our results show that the topic and partial summary help the models identify relevant content, but the task remains a significant challenge.
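To make the task's inputs and output concrete, the following is a minimal Python sketch of what a single summary cloze instance might look like, together with a naive word-overlap selector. The class name, field names, and the `select_by_overlap` heuristic are all illustrative assumptions; they are not the dataset's actual schema or any of the models evaluated in the paper.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class SummaryClozeInstance:
    """Hypothetical container; field names are assumptions, not the dataset schema."""
    topic: str             # the topic the summary focuses on
    context: List[str]     # sentences at the beginning of the summary
    references: List[str]  # sentences from the reference document(s)
    cloze: str             # the next summary sentence, i.e. the target


def tokens(text: str) -> Set[str]:
    """Lowercase, whitespace-split tokens with surrounding punctuation stripped."""
    return {w.strip(".,;:!?()\"'").lower() for w in text.split()} - {""}


def select_by_overlap(instance: SummaryClozeInstance, k: int = 1) -> List[str]:
    """Return the k reference sentences sharing the most words with the
    topic and partial summary (a toy stand-in for content selection)."""
    query = tokens(instance.topic)
    for sentence in instance.context:
        query |= tokens(sentence)
    ranked = sorted(
        instance.references,
        key=lambda s: len(tokens(s) & query),
        reverse=True,  # sorted() is stable, so ties keep document order
    )
    return ranked[:k]


example = SummaryClozeInstance(
    topic="Early life",
    context=["Ada Lovelace was born in London in 1815."],
    references=[
        "The weather in Paris is mild.",
        "Lovelace was educated in London by private tutors.",
        "Bananas are yellow.",
    ],
    cloze="She was educated privately by tutors.",
)
selected = select_by_overlap(example, k=1)
```

In this toy example the selector conditions on both the topic and the partial summary, mirroring the conditioning described in the abstract; a real system would replace the overlap score with a learned model.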
Anthology ID:
D19-1386
Volume:
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Month:
November
Year:
2019
Address:
Hong Kong, China
Editors:
Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
Venues:
EMNLP | IJCNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
3720–3729
URL:
https://aclanthology.org/D19-1386
DOI:
10.18653/v1/D19-1386
Cite (ACL):
Daniel Deutsch and Dan Roth. 2019. Summary Cloze: A New Task for Content Selection in Topic-Focused Summarization. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3720–3729, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Summary Cloze: A New Task for Content Selection in Topic-Focused Summarization (Deutsch & Roth, EMNLP-IJCNLP 2019)
PDF:
https://aclanthology.org/D19-1386.pdf
Attachment:
 D19-1386.Attachment.zip