Heterogeneous Graph Neural Networks for Extractive Document Summarization

Danqing Wang, Pengfei Liu, Yining Zheng, Xipeng Qiu, Xuanjing Huang


Abstract
As a crucial step in extractive document summarization, learning cross-sentence relations has been explored by a plethora of approaches. An intuitive way is to put them in the graph-based neural network, which has a more complex structure for capturing inter-sentence relationships. In this paper, we present a heterogeneous graph-based neural network for extractive summarization (HETERSUMGRAPH), which contains semantic nodes of different granularity levels apart from sentences. These additional nodes act as the intermediary between sentences and enrich the cross-sentence relations. Besides, our graph structure is flexible in natural extension from a single-document setting to multi-document via introducing document nodes. To our knowledge, we are the first one to introduce different types of nodes into graph-based neural networks for extractive document summarization and perform a comprehensive qualitative analysis to investigate their benefits. The code will be released on Github.
Anthology ID:
2020.acl-main.553
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
6209–6219
Language:
URL:
https://aclanthology.org/2020.acl-main.553
DOI:
10.18653/v1/2020.acl-main.553
Bibkey:
Cite (ACL):
Danqing Wang, Pengfei Liu, Yining Zheng, Xipeng Qiu, and Xuanjing Huang. 2020. Heterogeneous Graph Neural Networks for Extractive Document Summarization. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 6209–6219, Online. Association for Computational Linguistics.
Cite (Informal):
Heterogeneous Graph Neural Networks for Extractive Document Summarization (Wang et al., ACL 2020)
Copy Citation:
PDF:
https://aclanthology.org/2020.acl-main.553.pdf
Video:
 http://slideslive.com/38929003
Code
 brxx122/HeterSUMGraph
Data
CNN/Daily MailMulti-NewsNew York Times Annotated Corpus