STRASS: A Light and Effective Method for Extractive Summarization Based on Sentence Embeddings

Léo Bouscarrat, Antoine Bonnefoy, Thomas Peel, Cécile Pereira


Abstract
This paper introduces STRASS: Summarization by TRAnsformation Selection and Scoring. It is an extractive text summarization method that leverages the semantic information in existing sentence embedding spaces. Our method creates an extractive summary by selecting the sentences whose embeddings are closest to the document embedding. The model learns a transformation of the document embedding to maximize the similarity between the extractive summary and the ground truth summary. As the transformation is composed of only a dense layer, training can be done on CPU and is therefore inexpensive. Moreover, inference time is short and linear in the number of sentences. As a second contribution, we introduce the French CASS dataset, composed of judgments from the French Court of cassation and their corresponding summaries. On this dataset, our results show that our method performs similarly to state-of-the-art extractive methods, with efficient training and inference times.
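The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of cosine similarity as the closeness measure, and the fixed `top_k` cut-off are assumptions made for illustration, and the dense-layer parameters `weights` and `bias` are taken as already trained.

```python
import numpy as np


def cosine_similarity(matrix, vector):
    """Cosine similarity between each row of `matrix` and `vector`."""
    row_norms = np.linalg.norm(matrix, axis=1)
    vector_norm = np.linalg.norm(vector)
    # Small constant guards against division by zero for all-zero embeddings.
    return (matrix @ vector) / (row_norms * vector_norm + 1e-12)


def select_sentences(sentence_embeddings, document_embedding, weights, bias, top_k):
    """Return indices of the top_k sentences closest to the transformed
    document embedding, in document order, together with all scores.

    Hypothetical helper: `weights` and `bias` stand in for a single
    pre-trained dense layer; `top_k` is an assumed selection rule.
    """
    # Apply the dense layer (an affine map) to the document embedding.
    transformed = weights @ document_embedding + bias
    # Score every sentence once: cost is linear in the number of sentences.
    scores = cosine_similarity(sentence_embeddings, transformed)
    # Keep the top_k highest-scoring sentences, restored to document order.
    best = np.argsort(-scores)[:top_k]
    return sorted(int(i) for i in best), scores


if __name__ == "__main__":
    # Toy 2-dimensional embeddings for three sentences.
    sentences = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    document = np.array([1.0, 0.0])
    indices, scores = select_sentences(
        sentences, document, np.eye(2), np.zeros(2), top_k=2
    )
    print(indices, scores)
```

Each sentence is scored with a single similarity computation, which is consistent with the linear inference time stated in the abstract.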
Anthology ID:
P19-2034
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Fernando Alva-Manchego, Eunsol Choi, Daniel Khashabi
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
243–252
URL:
https://aclanthology.org/P19-2034/
DOI:
10.18653/v1/P19-2034
Cite (ACL):
Léo Bouscarrat, Antoine Bonnefoy, Thomas Peel, and Cécile Pereira. 2019. STRASS: A Light and Effective Method for Extractive Summarization Based on Sentence Embeddings. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, pages 243–252, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
STRASS: A Light and Effective Method for Extractive Summarization Based on Sentence Embeddings (Bouscarrat et al., ACL 2019)
PDF:
https://aclanthology.org/P19-2034.pdf
Code
euranova/CASS-dataset
Data
French CASS dataset