Sparse Optimization for Unsupervised Extractive Summarization of Long Documents with the Frank-Wolfe Algorithm

Alicia Tsai; Laurent El Ghaoui

doi:10.18653/v1/2020.sustainlp-1.8

Abstract

We address unsupervised extractive document summarization, with a focus on long documents. We model the task as a sparse auto-regression problem and approximate the resulting combinatorial problem with a convex, norm-constrained one, which we solve using a dedicated Frank-Wolfe algorithm. To generate a summary with k sentences, the algorithm needs only about k iterations, which makes it very efficient for long documents. We evaluate our approach against two other unsupervised methods using both lexical (standard) ROUGE scores and semantic (embedding-based) ones. Our method achieves better results on both datasets and works especially well when combined with embeddings for highly paraphrased summaries.
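
To illustrate why roughly k Frank-Wolfe iterations can yield a k-sentence summary, here is a minimal sketch. It is not the authors' formulation: the least-squares objective, the l1-ball constraint of radius `tau`, the step size 2/(t+2), and the name `frank_wolfe_l1` are all assumptions chosen for illustration. Starting from zero, each iteration moves toward a single vertex of the l1 ball, so it activates at most one new coordinate (one sentence).

```python
def frank_wolfe_l1(A, b, tau, k):
    """Run k Frank-Wolfe steps on: min_w 0.5*||A w - b||^2  s.t. ||w||_1 <= tau.

    Hypothetical setup: A is a list of rows (features x sentences), so each
    column is one sentence vector; b is a document-level target vector.
    """
    n_feat, n_sent = len(A), len(A[0])
    w = [0.0] * n_sent
    for t in range(k):
        # Residual r = A w - b and gradient g = A^T r.
        r = [sum(A[i][j] * w[j] for j in range(n_sent)) - b[i]
             for i in range(n_feat)]
        g = [sum(A[i][j] * r[i] for i in range(n_feat))
             for j in range(n_sent)]
        # Linear minimization over the l1 ball: the minimizer is a signed
        # vertex tau * (+/- e_j) at the coordinate with the largest |gradient|.
        j_star = max(range(n_sent), key=lambda j: abs(g[j]))
        sign = -1.0 if g[j_star] > 0 else 1.0
        # Standard step size; the convex combination keeps w inside the ball.
        gamma = 2.0 / (t + 2.0)
        w = [(1.0 - gamma) * wj for wj in w]
        w[j_star] += gamma * tau * sign
    return w


# Toy data (assumed): 3 features, 4 candidate sentences as columns.
A = [[1.0, 0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0, 1.0],
     [0.0, 0.0, 1.0, 0.0]]
b = [1.0, 1.0, 0.1]

weights = frank_wolfe_l1(A, b, tau=1.0, k=2)
selected = [j for j, v in enumerate(weights) if v != 0.0]
print(selected)  # indices of the sentences with nonzero weight
```

The nonzero entries of the returned vector play the role of the selected sentences; after k iterations there can be at most k of them, which is the property the abstract relies on for efficiency.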

Anthology ID: 2020.sustainlp-1.8
Volume: Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing
Month: November
Year: 2020
Address: Online
Editors: Nafise Sadat Moosavi, Angela Fan, Vered Shwartz, Goran Glavaš, Shafiq Joty, Alex Wang, Thomas Wolf
Venue: sustainlp
Publisher: Association for Computational Linguistics
Pages: 54–62
URL: https://aclanthology.org/2020.sustainlp-1.8/
DOI: 10.18653/v1/2020.sustainlp-1.8
Cite (ACL): Alicia Tsai and Laurent El Ghaoui. 2020. Sparse Optimization for Unsupervised Extractive Summarization of Long Documents with the Frank-Wolfe Algorithm. In Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing, pages 54–62, Online. Association for Computational Linguistics.
Cite (Informal): Sparse Optimization for Unsupervised Extractive Summarization of Long Documents with the Frank-Wolfe Algorithm (Tsai & El Ghaoui, sustainlp 2020)
PDF: https://aclanthology.org/2020.sustainlp-1.8.pdf
Optional supplementary material: 2020.sustainlp-1.8.OptionalSupplementaryMaterial.pdf
Video: https://slideslive.com/38939430