Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization

Shuming Ma, Xu Sun, Junyang Lin, Houfeng Wang


Abstract
Most current abstractive text summarization models are based on the sequence-to-sequence model (Seq2Seq). The source content of social media is long and noisy, so it is difficult for Seq2Seq to learn an accurate semantic representation. Compared with the source content, the annotated summary is short and well written. Moreover, it shares the same meaning as the source content. In this work, we supervise the learning of the representation of the source content with that of the summary. In our implementation, we regard a summary autoencoder as an assistant supervisor of Seq2Seq. Following previous work, we evaluate our model on a popular Chinese social media dataset. Experimental results show that our model achieves state-of-the-art performance on the benchmark dataset.
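The abstract does not state the training objective, so the following is only a minimal sketch of the idea of supervising the source representation with the summary representation. It assumes the supervision is a distance penalty between the two encoder representations, added to the Seq2Seq loss and the autoencoder loss. The function names, the mean squared distance, and the `weight` parameter are illustrative assumptions and are not taken from the paper or its released code.

```python
def representation_gap(source_repr, summary_repr):
    """Mean squared distance between two equal-length representation vectors.

    Assumption: both representations are plain lists of floats of the same size.
    """
    if len(source_repr) != len(summary_repr):
        raise ValueError("representations must have the same dimensionality")
    squared = [(s - t) ** 2 for s, t in zip(source_repr, summary_repr)]
    return sum(squared) / len(squared)


def supervised_loss(seq2seq_nll, autoencoder_nll, source_repr, summary_repr, weight=1.0):
    """Combine the two generation losses with a representation-matching penalty.

    seq2seq_nll:     loss of generating the summary from the source text
    autoencoder_nll: loss of reconstructing the summary from itself
    weight:          hypothetical coefficient on the supervision term
    """
    gap = representation_gap(source_repr, summary_repr)
    return seq2seq_nll + autoencoder_nll + weight * gap
```

Under these assumptions, minimizing the combined loss pulls the noisy source representation toward the cleaner summary representation while both decoders are still trained on their own targets.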
Anthology ID:
P18-2115
Volume:
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Iryna Gurevych, Yusuke Miyao
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
725–731
URL:
https://aclanthology.org/P18-2115
DOI:
10.18653/v1/P18-2115
Cite (ACL):
Shuming Ma, Xu Sun, Junyang Lin, and Houfeng Wang. 2018. Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 725–731, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization (Ma et al., ACL 2018)
PDF:
https://aclanthology.org/P18-2115.pdf
Video:
https://aclanthology.org/P18-2115.mp4
Code:
lancopku/superAE
Data:
LCSTS