Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization

Lei Xu, Mohammed Asad Karim, Saket Dingliwal, Aparna Elangovan


Abstract
Large language models (LLMs) can generate fluent summaries across domains using prompting techniques, reducing the effort required for summarization applications. However, crafting effective prompts that guide LLMs to generate summaries with the appropriate level of detail and writing style remains a challenge. In this paper, we explore the use of salient information extracted from the source document to enhance summarization prompts. We show that adding keyphrases to prompts can improve ROUGE F1 and recall, making the generated summaries more similar to the reference and more complete. The number of keyphrases can be used to control the precision-recall trade-off. Furthermore, our analysis reveals that incorporating phrase-level salient information is superior to incorporating word- or sentence-level information. However, the impact on summary faithfulness is not universally positive across LLMs. To enable this approach, we introduce Keyphrase Signal Extractor (SigExt), a lightweight model that can be finetuned to extract salient keyphrases. By using SigExt, we achieve consistent ROUGE improvements across datasets and LLMs without any LLM customization. Our findings provide insights into leveraging salient information in building prompt-based summarization systems.
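The idea of adding keyphrases to a summarization prompt can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `select_keyphrases` and `build_prompt`, the prompt wording, and the scored-phrase input format are all assumptions made for illustration. The scores stand in for the output of an extractor such as SigExt, and `k` is the number of keyphrases kept, which the abstract describes as the knob for the precision-recall trade-off.

```python
from typing import List, Tuple


def select_keyphrases(scored: List[Tuple[str, float]], k: int) -> List[str]:
    """Keep the k highest-scoring phrases, skipping case-insensitive duplicates.

    `scored` is a hypothetical list of (phrase, salience score) pairs, e.g. as
    produced by some keyphrase extractor. The format is an assumption.
    """
    seen = set()
    selected: List[str] = []
    for phrase, _score in sorted(scored, key=lambda item: item[1], reverse=True):
        if len(selected) >= k:
            break
        key = phrase.lower()
        if key in seen:
            continue
        seen.add(key)
        selected.append(phrase)
    return selected


def build_prompt(document: str, keyphrases: List[str]) -> str:
    """Assemble a summarization prompt, optionally listing keyphrases.

    The prompt wording here is illustrative only and is not taken from the paper.
    """
    parts = ["Summarize the following document.", "", "Document:", document, ""]
    if keyphrases:
        parts.append("Consider including these keyphrases: " + "; ".join(keyphrases))
        parts.append("")
    parts.append("Summary:")
    return "\n".join(parts)


# Hypothetical extractor output for a short document.
scored_phrases = [
    ("merger", 0.9),
    ("Merger", 0.8),
    ("quarterly revenue", 0.7),
    ("CEO", 0.2),
]
top_phrases = select_keyphrases(scored_phrases, k=2)
prompt = build_prompt("The two companies announced a merger ...", top_phrases)
```

Raising `k` places more phrases in the prompt and lowering it places fewer. The resulting string can be sent to any LLM without changing the model itself.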
Anthology ID:
2024.emnlp-industry.4
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month:
November
Year:
2024
Address:
Miami, Florida, US
Editors:
Franck Dernoncourt, Daniel Preoţiuc-Pietro, Anastasia Shimorina
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
35–49
URL:
https://aclanthology.org/2024.emnlp-industry.4
Cite (ACL):
Lei Xu, Mohammed Asad Karim, Saket Dingliwal, and Aparna Elangovan. 2024. Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 35–49, Miami, Florida, US. Association for Computational Linguistics.
Cite (Informal):
Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization (Xu et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-industry.4.pdf