Learning to Encode Text as Human-Readable Summaries using Generative Adversarial Networks

Yaushian Wang, Hung-Yi Lee


Abstract
Auto-encoders compress input data into a latent-space representation and reconstruct the original data from the representation. This latent representation is not easily interpreted by humans. In this paper, we propose training an auto-encoder that encodes input text into human-readable sentences, and unpaired abstractive summarization is thereby achieved. The auto-encoder is composed of a generator and a reconstructor. The generator encodes the input text into a shorter word sequence, and the reconstructor recovers the generator input from the generator output. To make the generator output human-readable, a discriminator restricts the output of the generator to resemble human-written sentences. By taking the generator output as the summary of the input text, abstractive summarization is achieved without document-summary pairs as training data. Promising results are shown on both English and Chinese corpora.
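The abstract describes three cooperating components: a generator that compresses the input text into a shorter word sequence, a reconstructor that recovers the input from that sequence, and a discriminator that pushes the generator's output toward human-readable sentences. A minimal pure-Python sketch of that data flow is below; the component internals here are placeholder stand-ins, not the paper's actual seq2seq/GAN architecture, and the loss computations are illustrative only.

```python
VOCAB = {"the", "cat", "sat", "on", "mat", "dog", "ran", "fast"}

def generator(text_tokens, summary_len=3):
    # Stand-in for the generator: emits a shorter word sequence.
    # (The paper learns this; here we just truncate as a placeholder policy.)
    return text_tokens[:summary_len]

def reconstructor(summary_tokens, target_len):
    # Stand-in for the reconstructor: tries to recover the generator input
    # from the summary. Here we simply tile the summary to the target length.
    reps = target_len // len(summary_tokens) + 1
    return (summary_tokens * reps)[:target_len]

def discriminator(tokens):
    # Stand-in for the discriminator: scores how "human-readable" the
    # summary looks, in [0, 1]. Real model: trained on human-written text.
    return 1.0 if all(t in VOCAB for t in tokens) else 0.0

document = ["the", "cat", "sat", "on", "the", "mat"]
summary = generator(document)
reconstruction = reconstructor(summary, target_len=len(document))

# Reconstruction loss: fraction of mismatched positions (lower is better).
recon_loss = sum(a != b for a, b in zip(document, reconstruction)) / len(document)
# Adversarial signal: the discriminator should accept the summary as text.
adv_score = discriminator(summary)

print(summary, recon_loss, adv_score)
```

In the paper both signals train the generator jointly, so that the summary is simultaneously informative enough to reconstruct the document and fluent enough to fool the discriminator; no document–summary pairs are needed.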
Anthology ID:
D18-1451
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
4187–4195
URL:
https://aclanthology.org/D18-1451
DOI:
10.18653/v1/D18-1451
Cite (ACL):
Yaushian Wang and Hung-Yi Lee. 2018. Learning to Encode Text as Human-Readable Summaries using Generative Adversarial Networks. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 4187–4195, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Learning to Encode Text as Human-Readable Summaries using Generative Adversarial Networks (Wang & Lee, EMNLP 2018)
PDF:
https://aclanthology.org/D18-1451.pdf
Attachment:
D18-1451.Attachment.pdf