Generating Abstractive Summaries with Finetuned Language Models

Sebastian Gehrmann, Zachary Ziegler, Alexander Rush


Abstract
Neural abstractive document summarization is commonly approached by models that exhibit a mostly extractive behavior. This behavior is facilitated by a copy-attention mechanism that allows models to copy words from a source document. While models in the mostly extractive news summarization domain benefit from this inductive bias, they commonly fail to paraphrase or compress information from the source document. Recent advances in transfer learning from large pretrained language models give rise to alternative approaches that do not rely on copy-attention and instead learn to generate concise and abstractive summaries. In this paper, as part of the TL;DR challenge, we compare the abstractiveness of summaries from different summarization approaches and show that transfer learning can be utilized efficiently without any changes to the model architecture. We demonstrate that the approach leads to a higher level of abstraction at similar performance on the TL;DR challenge tasks, enabling true natural language compression.
Anthology ID:
W19-8665
Volume:
Proceedings of the 12th International Conference on Natural Language Generation
Month:
October–November
Year:
2019
Address:
Tokyo, Japan
Venues:
INLG | WS
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
516–522
URL:
https://aclanthology.org/W19-8665
DOI:
10.18653/v1/W19-8665
PDF:
https://aclanthology.org/W19-8665.pdf