A study of semantic augmentation of word embeddings for extractive summarization

Nikiforos Pittaras, Vangelis Karkaletsis


Abstract
In this study we examine the effect of semantic augmentation approaches on extractive text summarization. WordNet hypernym relations are used to extract term-frequency concept information, which is subsequently concatenated to sentence-level representations produced by aggregating deep neural word embeddings. Multiple dimensionality reduction techniques and combination strategies are examined via feature transformation and clustering methods. An experimental evaluation on the MultiLing 2015 MSS dataset illustrates that semantic information can benefit the extractive summarization process in terms of F1, ROUGE-1 and ROUGE-2 scores, with LSA-based post-processing introducing the largest improvements.
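The augmentation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy `EMBEDDINGS` and `HYPERNYMS` tables, the function names, mean-pooling as the aggregation step, and applying a truncated SVD (an LSA-style reduction) to the concatenated features are all assumptions made for illustration. A real pipeline would presumably use pretrained embeddings and WordNet lookups instead of the hard-coded dictionaries.

```python
import numpy as np

# Hypothetical toy resources standing in for pretrained embeddings and WordNet.
EMBEDDINGS = {
    "dog": np.array([0.9, 0.1, 0.0]),
    "cat": np.array([0.8, 0.2, 0.1]),
    "car": np.array([0.1, 0.9, 0.3]),
    "runs": np.array([0.4, 0.4, 0.8]),
}
HYPERNYMS = {
    "dog": ["canine", "animal"],
    "cat": ["feline", "animal"],
    "car": ["vehicle"],
}
# Fixed concept vocabulary: one dimension per distinct hypernym.
CONCEPTS = sorted({c for concepts in HYPERNYMS.values() for c in concepts})


def sentence_embedding(tokens):
    """Aggregate word embeddings by averaging (an assumed aggregation choice)."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return np.zeros(3)
    return np.mean(vectors, axis=0)


def concept_frequencies(tokens):
    """Count how often each hypernym concept is triggered by the tokens."""
    counts = np.zeros(len(CONCEPTS))
    for token in tokens:
        for concept in HYPERNYMS.get(token, []):
            counts[CONCEPTS.index(concept)] += 1
    return counts


def augment(tokens):
    """Concatenate the embedding-based and concept-based sentence vectors."""
    return np.concatenate([sentence_embedding(tokens), concept_frequencies(tokens)])


def lsa_reduce(matrix, k):
    """Project rows onto the top-k singular directions (truncated SVD)."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]


sentences = [["dog", "runs"], ["cat", "runs"], ["car", "runs"]]
features = np.vstack([augment(s) for s in sentences])
reduced = lsa_reduce(features, k=2)
```

Here each sentence yields a 7-dimensional vector (3 embedding dimensions plus 4 concept counts), which the SVD step compresses to 2 dimensions. Where the reduction is applied (to the concept vector alone or to the concatenation) is a design choice; this sketch shows only one placement.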
Anthology ID:
W19-8909
Volume:
Proceedings of the Workshop MultiLing 2019: Summarization Across Languages, Genres and Sources
Month:
September
Year:
2019
Address:
Varna, Bulgaria
Editor:
George Giannakopoulos
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
63–72
URL:
https://aclanthology.org/W19-8909
DOI:
10.26615/978-954-452-058-8_009
Cite (ACL):
Nikiforos Pittaras and Vangelis Karkaletsis. 2019. A study of semantic augmentation of word embeddings for extractive summarization. In Proceedings of the Workshop MultiLing 2019: Summarization Across Languages, Genres and Sources, pages 63–72, Varna, Bulgaria. INCOMA Ltd.
Cite (Informal):
A study of semantic augmentation of word embeddings for extractive summarization (Pittaras & Karkaletsis, RANLP 2019)
PDF:
https://aclanthology.org/W19-8909.pdf