Enhancing Multi-Document Summarization with Cross-Document Graph-based Information Extraction

Zixuan Zhang, Heba Elfardy, Markus Dreyer, Kevin Small, Heng Ji, Mohit Bansal


Abstract
Information extraction (IE) and summarization are closely related, both tasked with presenting a subset of the information contained in a natural language text. However, while IE extracts structural representations, summarization aims to abstract the most salient information into a generated text summary – thus potentially encountering the technical limitations of current text generation methods (e.g., hallucination). To mitigate this risk, this work uses structured IE graphs to enhance the abstractive summarization task. Specifically, we focus on improving Multi-Document Summarization (MDS) performance by using cross-document IE output, incorporating two novel components: (1) auxiliary entity and event recognition systems that focus the summary generation model; and (2) an alignment loss between IE nodes and their text spans that reduces inconsistencies between the IE graphs and text representations. Operationally, both the IE nodes and their corresponding text spans are projected into the same embedding space, and the pairwise distance between them is minimized. Experimental results on multiple MDS benchmarks show that summaries generated by our model are more factually consistent with the source documents than those of baseline models while maintaining the same level of abstractiveness.
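The alignment step described at the end of the abstract can be illustrated with a minimal sketch. The abstract does not specify the projection or the distance function, so everything concrete here is an assumption made for illustration: the linear projections, the squared Euclidean distance, and the names `project` and `alignment_loss` are hypothetical and are not taken from the paper.

```python
def project(vec, weights):
    """Map a vector into the shared space with a linear projection.

    `weights` is a list of rows; each row yields one output dimension.
    (A linear projection is an assumption; the abstract does not say
    how nodes and spans are projected.)
    """
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def alignment_loss(node_vecs, span_vecs, w_node, w_span):
    """Mean squared Euclidean distance between paired projections.

    node_vecs[i] is the embedding of an IE node and span_vecs[i] is the
    embedding of the text span it was extracted from. Squared Euclidean
    distance is an assumed choice of pairwise distance.
    """
    if len(node_vecs) != len(span_vecs) or not node_vecs:
        raise ValueError("need the same non-zero number of nodes and spans")
    total = 0.0
    for node, span in zip(node_vecs, span_vecs):
        p = project(node, w_node)  # node in the shared space
        q = project(span, w_span)  # span in the shared space
        total += sum((a - b) ** 2 for a, b in zip(p, q))
    return total / len(node_vecs)


# Toy example: 3-d node embeddings and 2-d span embeddings, both mapped to 2-d.
w_node = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0]]  # drops the third node dimension
w_span = [[1.0, 0.0],
          [0.0, 1.0]]       # identity

nodes = [[1.0, 0.0, 5.0], [0.0, 1.0, 5.0]]
aligned_spans = [[1.0, 0.0], [0.0, 1.0]]
misaligned_spans = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = alignment_loss(nodes, aligned_spans, w_node, w_span)
loss_misaligned = alignment_loss(nodes, misaligned_spans, w_node, w_span)
```

In a training setup of this kind, a term like this would typically be added to the summarization objective and minimized with respect to the projection weights, so that each node ends up close to its own span in the shared space.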
Anthology ID:
2023.eacl-main.124
Volume:
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
Month:
May
Year:
2023
Address:
Dubrovnik, Croatia
Editors:
Andreas Vlachos, Isabelle Augenstein
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
1696–1707
URL:
https://aclanthology.org/2023.eacl-main.124
DOI:
10.18653/v1/2023.eacl-main.124
Cite (ACL):
Zixuan Zhang, Heba Elfardy, Markus Dreyer, Kevin Small, Heng Ji, and Mohit Bansal. 2023. Enhancing Multi-Document Summarization with Cross-Document Graph-based Information Extraction. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 1696–1707, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal):
Enhancing Multi-Document Summarization with Cross-Document Graph-based Information Extraction (Zhang et al., EACL 2023)
PDF:
https://aclanthology.org/2023.eacl-main.124.pdf
Video:
https://aclanthology.org/2023.eacl-main.124.mp4