Source Identification in Abstractive Summarization

Yoshi Suhara, Dimitris Alikaniotis


Abstract
Neural abstractive summarization models generate summaries in an end-to-end manner, and little is known about how the source information is actually converted into a summary. In this paper, we define the input sentences that contain essential information for the generated summary as source sentences and study how abstractive summaries are produced by analyzing these source sentences. To this end, we annotate source sentences for reference summaries and for system summaries generated by PEGASUS on document–summary pairs sampled from the CNN/DailyMail and XSum datasets. We also formulate automatic source sentence detection and compare multiple methods to establish a strong baseline for the task. Experimental results show that the perplexity-based method performs well in highly abstractive settings, whereas similarity-based methods perform robustly in relatively extractive settings.
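To illustrate the task, a similarity-based source sentence detector can be sketched as follows. This is a hypothetical minimal example, not the authors' implementation: it scores each document sentence by its unigram-overlap F1 with the summary (a crude ROUGE-1-style similarity) and returns the top-k sentences as predicted source sentences. The function names and the choice of unigram F1 are illustrative assumptions.

```python
# Hypothetical similarity-based source sentence detection (not the paper's
# exact method): rank document sentences by unigram-overlap F1 with the
# summary and take the top-k as predicted source sentences.
from collections import Counter


def unigram_f1(sentence: str, summary: str) -> float:
    """ROUGE-1-style F1 between a sentence and the summary, on whitespace tokens."""
    s_tokens = Counter(sentence.lower().split())
    t_tokens = Counter(summary.lower().split())
    overlap = sum((s_tokens & t_tokens).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(s_tokens.values())
    recall = overlap / sum(t_tokens.values())
    return 2 * precision * recall / (precision + recall)


def detect_source_sentences(doc_sentences, summary, k=2):
    """Return the k document sentences most similar to the summary."""
    return sorted(doc_sentences,
                  key=lambda s: unigram_f1(s, summary),
                  reverse=True)[:k]


doc = [
    "The cat sat on the mat .",
    "Stocks fell sharply on Monday .",
    "Analysts blamed rising interest rates .",
]
summary = "Stocks fell as interest rates rose"
predicted = detect_source_sentences(doc, summary, k=2)
```

In this toy example, the two finance-related sentences share tokens with the summary and are selected as source sentences, while the unrelated first sentence is not. A perplexity-based method would instead score how much each input sentence lowers the summary's perplexity under a language model, which the paper finds more effective in highly abstractive settings.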
Anthology ID:
2024.eacl-short.20
Volume:
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
March
Year:
2024
Address:
St. Julian’s, Malta
Editors:
Yvette Graham, Matthew Purver
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
212–224
URL:
https://aclanthology.org/2024.eacl-short.20
Cite (ACL):
Yoshi Suhara and Dimitris Alikaniotis. 2024. Source Identification in Abstractive Summarization. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers), pages 212–224, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal):
Source Identification in Abstractive Summarization (Suhara & Alikaniotis, EACL 2024)
PDF:
https://aclanthology.org/2024.eacl-short.20.pdf
Software:
 2024.eacl-short.20.software.zip
Note:
 2024.eacl-short.20.note.zip
Video:
 https://aclanthology.org/2024.eacl-short.20.mp4