Multi Document Summarization Evaluation in the Presence of Damaging Content

Avshalom Manevich, David Carmel, Nachshon Cohen, Elad Kravi, Ori Shapira


Abstract
In the multi-document summarization (MDS) task, a summary is produced for a given set of documents. A recent line of research introduced the concept of damaging documents: documents that should not be exposed to readers for various reasons. In the presence of damaging documents, a summarizer is ideally expected to exclude damaging content from its output. Existing metrics evaluate a summary on aspects such as relevance and consistency with the source documents. We propose to additionally measure the ability of MDS systems to properly handle damaging documents in their input set. To that end, we offer two novel metrics based on lexical similarity and language-model likelihood. A set of experiments demonstrates the effectiveness of our metrics in measuring the ability of MDS systems to summarize a set of documents while eliminating damaging content from their summaries.
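
As a rough, hypothetical sketch of the lexical-similarity direction the abstract mentions (this is not the metric defined in the paper; the function `damaging_overlap`, its clipped n-gram precision formulation, and the example strings are all illustrative assumptions), one could score a summary by its n-gram overlap with the damaging documents, where lower is better:

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def damaging_overlap(summary, damaging_docs, n=2):
    """Fraction of the summary's n-grams (clipped counts) that also occur
    in any damaging document; lower is better, i.e., less leaked content."""
    summary_grams = ngrams(summary.lower().split(), n)
    if not summary_grams:
        return 0.0
    damaging_grams = Counter()
    for doc in damaging_docs:
        damaging_grams |= ngrams(doc.lower().split(), n)  # union keeps max counts
    leaked = sum(min(count, damaging_grams[gram])
                 for gram, count in summary_grams.items())
    return leaked / sum(summary_grams.values())

# Hypothetical example: a summary that repeats spans of a damaging document
# scores high; one sharing nothing with the checked documents scores zero.
summary = "the product caused harm according to leaked internal memos"
damaging = ["leaked internal memos show the product caused harm to users"]
safe = ["the quarterly report discusses revenue growth and market trends"]
print(damaging_overlap(summary, damaging))  # ~0.62: damaging content leaked
print(damaging_overlap(summary, safe))      # 0.0: no overlap with that source
```

A design note on the sketch: unlike ROUGE recall against reference summaries, this is a precision-style check of the summary against the damaging sources, so a score near zero indicates the summary avoids content traceable to them. The paper's second metric, based on language-model likelihood, is not sketched here.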
Anthology ID: 2023.findings-emnlp.1
Volume: Findings of the Association for Computational Linguistics: EMNLP 2023
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1–12
URL: https://aclanthology.org/2023.findings-emnlp.1
DOI: 10.18653/v1/2023.findings-emnlp.1
Cite (ACL): Avshalom Manevich, David Carmel, Nachshon Cohen, Elad Kravi, and Ori Shapira. 2023. Multi Document Summarization Evaluation in the Presence of Damaging Content. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 1–12, Singapore. Association for Computational Linguistics.
Cite (Informal): Multi Document Summarization Evaluation in the Presence of Damaging Content (Manevich et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-emnlp.1.pdf