Unsupervised Rewriter for Multi-Sentence Compression

Yang Zhao, Xiaoyu Shen, Wei Bi, Akiko Aizawa


Abstract
Multi-sentence compression (MSC) aims to generate a grammatical but reduced compression from multiple input sentences while retaining their key information. The previously dominant approach to MSC is the extraction-based word-graph approach. A few variants further leveraged lexical substitution to yield more abstractive compressions. However, two limitations exist. First, the word-graph approach, which simply concatenates fragments from multiple sentences, may yield non-fluent or ungrammatical compressions. Second, lexical substitution is often inappropriate without consideration of context information. To tackle these issues, we present a neural rewriter for multi-sentence compression that does not need any parallel corpus. Empirical studies show that our approach achieves comparable results under automatic evaluation and improves the grammaticality of compressions under human evaluation. A parallel corpus with more than 140,000 (sentence group, compression) pairs is also constructed as a by-product for future research.
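The extraction-based word-graph baseline that the abstract critiques can be sketched roughly as follows. This is a simplified illustration, not the paper's method: `build_word_graph` and `shortest_compression` are hypothetical names, nodes are merged by surface word alone (real systems such as Filippova's also use POS tags and enforce a minimum path length), and edge weights are simply inverse bigram counts.

```python
from collections import defaultdict
import heapq

def build_word_graph(sentences):
    """Merge identical words across sentences into shared nodes;
    edges carry counts of observed word adjacencies."""
    edges = defaultdict(int)
    for sent in sentences:
        tokens = ["<s>"] + sent.lower().split() + ["</s>"]
        for a, b in zip(tokens, tokens[1:]):
            edges[(a, b)] += 1
    return edges

def shortest_compression(edges):
    """Dijkstra search from <s> to </s>; edge weight = 1 / count,
    so paths through fragments shared by many sentences are cheap."""
    graph = defaultdict(list)
    for (a, b), count in edges.items():
        graph[a].append((b, 1.0 / count))
    heap = [(0.0, "<s>", ["<s>"])]
    seen = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == "</s>":
            return " ".join(path[1:-1])  # drop sentinel tokens
        if node in seen:
            continue
        seen.add(node)
        for nxt, w in graph[node]:
            if nxt not in seen:
                heapq.heappush(heap, (cost + w, nxt, path + [nxt]))
    return ""
```

Because the output path simply stitches together fragments from different input sentences, it can be non-fluent or ungrammatical, which is exactly the first limitation the proposed neural rewriter targets.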
Anthology ID:
P19-1216
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
2235–2240
URL:
https://aclanthology.org/P19-1216
DOI:
10.18653/v1/P19-1216
Cite (ACL):
Yang Zhao, Xiaoyu Shen, Wei Bi, and Akiko Aizawa. 2019. Unsupervised Rewriter for Multi-Sentence Compression. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2235–2240, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Unsupervised Rewriter for Multi-Sentence Compression (Zhao et al., ACL 2019)
PDF:
https://aclanthology.org/P19-1216.pdf
Supplementary:
P19-1216.Supplementary.pdf