GRETEL: Graph Contrastive Topic Enhanced Language Model for Long Document Extractive Summarization

Qianqian Xie, Jimin Huang, Tulika Saha, Sophia Ananiadou


Abstract
Recently, neural topic models (NTMs) have been incorporated into pre-trained language models (PLMs) to capture global semantic information for text summarization. However, these methods remain limited in how they capture and integrate that information. In this paper, we propose a novel model, the graph contrastive topic enhanced language model (GRETEL), which incorporates a graph contrastive topic model into a pre-trained language model to fully leverage both global and local contextual semantics for long document extractive summarization. To better capture and incorporate global semantic information into PLMs, the graph contrastive topic model integrates a hierarchical transformer encoder and graph contrastive learning to fuse the semantic information from the global document context and the gold summary. As a result, GRETEL encourages the model to efficiently extract salient sentences that are topically related to the gold summary, rather than redundant sentences that cover sub-optimal topics. Experimental results on both general domain and biomedical datasets demonstrate that our proposed method outperforms state-of-the-art (SOTA) methods.
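The abstract's central idea, pulling a document's topic representation toward its gold summary and then preferring sentences that are topically close to it, can be illustrated with a toy sketch. This is not the authors' implementation: GRETEL uses a hierarchical transformer encoder and graph contrastive learning, whereas the code below assumes a generic InfoNCE-style contrastive loss over plain vectors and a simple cosine-similarity ranking. The function names (`cosine`, `info_nce`, `select_sentences`), the temperature value, and the example vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def info_nce(anchor, positive, negatives, temperature=0.5):
    """Generic InfoNCE-style loss (an assumption, not the paper's exact objective).

    The loss is low when `anchor` (e.g. a document topic vector) is closer to
    `positive` (e.g. its gold-summary vector) than to any of the `negatives`.
    """
    pos = cosine(anchor, positive) / temperature
    logits = [pos] + [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    top = max(logits)
    log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_denominator - pos


def select_sentences(sentence_vecs, topic_vec, k=2):
    """Pick the k sentences most similar to the topic vector, in document order."""
    ranked = sorted(
        range(len(sentence_vecs)),
        key=lambda i: cosine(sentence_vecs[i], topic_vec),
        reverse=True,
    )
    return sorted(ranked[:k])


if __name__ == "__main__":
    doc_topic = [1.0, 0.0]
    print(info_nce(doc_topic, [1.0, 0.0], [[0.0, 1.0]]))  # aligned pair: small loss
    print(info_nce(doc_topic, [0.0, 1.0], [[1.0, 0.0]]))  # misaligned pair: larger loss
    print(select_sentences([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]], doc_topic, k=2))
```

In this sketch the contrastive term only shapes the representation; sentence selection is a separate similarity ranking. A real extractive summarizer would combine such a topical signal with contextual sentence scores from the PLM.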
Anthology ID:
2022.coling-1.546
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
6259–6269
URL:
https://aclanthology.org/2022.coling-1.546
Cite (ACL):
Qianqian Xie, Jimin Huang, Tulika Saha, and Sophia Ananiadou. 2022. GRETEL: Graph Contrastive Topic Enhanced Language Model for Long Document Extractive Summarization. In Proceedings of the 29th International Conference on Computational Linguistics, pages 6259–6269, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
GRETEL: Graph Contrastive Topic Enhanced Language Model for Long Document Extractive Summarization (Xie et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.546.pdf
Code:
xashely/gretel_extractive
Data:
CORD-19, Pubmed, S2ORC