OTExtSum: Extractive Text Summarisation with Optimal Transport

Peggy Tang; Kun Hu; Rui Yan; Lei Zhang; Junbin Gao; Zhiyong Wang

doi:10.18653/v1/2022.findings-naacl.85

Abstract

Extractive text summarisation aims to select salient sentences from a document to form a short yet informative summary. While learning-based methods have achieved promising results, they have several limitations, such as dependence on expensive training and lack of interpretability. Therefore, in this paper, we propose a novel non-learning-based method, the Optimal Transport Extractive Summariser (OTExtSum), which for the first time formulates text summarisation as an Optimal Transport (OT) problem. Optimal sentence extraction is conceptualised as obtaining an optimal summary that minimises the transportation cost to a given document with respect to their semantic distributions. This cost is defined by the Wasserstein distance and is used to measure the summary’s semantic coverage of the original document. Comprehensive experiments on four challenging and widely used datasets (MultiNews, PubMed, BillSum, and CNN/DM) demonstrate that our proposed method outperforms the state-of-the-art non-learning-based methods and several recent learning-based methods in terms of the ROUGE metric.
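
The idea of scoring a candidate summary by its transport cost to the document can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy 2-D word embeddings, the term-frequency "semantic distributions", the Euclidean ground cost, the entropic-regularised Sinkhorn approximation of the Wasserstein distance (swapped in here for brevity), and the brute-force search over sentence subsets. The function names `term_distribution`, `sinkhorn_cost`, and `best_summary` are hypothetical.

```python
import itertools
from collections import Counter

import numpy as np


def term_distribution(tokens, vocab):
    """Normalised term-frequency vector over a fixed vocabulary (assumed representation)."""
    counts = Counter(tokens)
    freq = np.array([counts[w] for w in vocab], dtype=float)
    return freq / freq.sum()


def sinkhorn_cost(a, b, cost, reg=0.1, n_iter=500):
    """Approximate OT cost between histograms a and b using Sinkhorn iterations."""
    # Keep only the support of each histogram so no division by zero occurs.
    ia, ib = a > 0, b > 0
    a, b, cost = a[ia], b[ib], cost[np.ix_(ia, ib)]
    kernel = np.exp(-cost / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (kernel.T @ u)
        u = a / (kernel @ v)
    plan = u[:, None] * kernel * v[None, :]  # approximate transport plan
    return float((plan * cost).sum())


def best_summary(sentences, embeddings, budget):
    """Score every subset of `budget` sentences; return the cheapest and all scores."""
    vocab = sorted(embeddings)
    emb = np.array([embeddings[w] for w in vocab])
    # Ground cost: Euclidean distance between word embeddings (an assumption).
    cost = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    doc = term_distribution([t for s in sentences for t in s], vocab)
    scored = []
    for idx in itertools.combinations(range(len(sentences)), budget):
        tokens = [t for i in idx for t in sentences[i]]
        summary = term_distribution(tokens, vocab)
        scored.append((sinkhorn_cost(summary, doc, cost), idx))
    return min(scored), scored


# Toy data: two "animal" sentences and one "finance" sentence.
embeddings = {
    "cat": (0.0, 0.0),
    "dog": (0.1, 0.0),
    "stock": (1.0, 1.0),
    "market": (1.0, 0.9),
}
sentences = [["cat", "dog"], ["dog", "cat"], ["stock", "market"]]

(best_cost, best_idx), scored = best_summary(sentences, embeddings, budget=2)
print(best_idx, round(best_cost, 3))
```

In this toy document, a two-sentence summary that includes the finance sentence covers both topics and therefore has a lower transport cost than the summary made of the two animal sentences alone. Enumerating all subsets is only feasible for very short documents; a practical system would need a cheaper search strategy.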

Anthology ID: 2022.findings-naacl.85
Volume: Findings of the Association for Computational Linguistics: NAACL 2022
Month: July
Year: 2022
Address: Seattle, United States
Editors: Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1128–1141
URL: https://aclanthology.org/2022.findings-naacl.85/
DOI: 10.18653/v1/2022.findings-naacl.85
Cite (ACL): Peggy Tang, Kun Hu, Rui Yan, Lei Zhang, Junbin Gao, and Zhiyong Wang. 2022. OTExtSum: Extractive Text Summarisation with Optimal Transport. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 1128–1141, Seattle, United States. Association for Computational Linguistics.
Cite (Informal): OTExtSum: Extractive Text Summarisation with Optimal Transport (Tang et al., Findings 2022)
PDF: https://aclanthology.org/2022.findings-naacl.85.pdf
Software: 2022.findings-naacl.85.software.zip
Video: https://aclanthology.org/2022.findings-naacl.85.mp4
