Source-summary Entity Aggregation in Abstractive Summarization

José Ángel González, Annie Louis, Jackie Chi Kit Cheung


Abstract
In a text, entities mentioned earlier can be referred to in later discourse by a more general description. For example, Celine Dion and Justin Bieber can be referred to as "Canadian singers" or "celebrities". In this work, we study this phenomenon in the context of summarization, where entities from a source text are generalized in the summary. We call such instances source-summary entity aggregations. We categorize these aggregations into two types and analyze them in the CNN/DailyMail corpus, showing that they are reasonably frequent. We then examine how well three state-of-the-art summarization systems can generate such aggregations within summaries. We also develop techniques to encourage them to generate more aggregations. Our results show that there is significant room for improvement in producing semantically correct aggregations.
Anthology ID:
2022.coling-1.526
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
6019–6034
URL:
https://aclanthology.org/2022.coling-1.526
Cite (ACL):
José Ángel González, Annie Louis, and Jackie Chi Kit Cheung. 2022. Source-summary Entity Aggregation in Abstractive Summarization. In Proceedings of the 29th International Conference on Computational Linguistics, pages 6019–6034, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Source-summary Entity Aggregation in Abstractive Summarization (González et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.526.pdf