Graph-Augmented Open-Domain Multi-Document Summarization

Xiaoping Shen, Yekun Chai


Abstract
In the open-domain multi-document summarization (ODMDS) task, retrieving relevant documents from large repositories and generating coherent summaries are crucial. However, existing methods often treat retrieval and summarization as separate tasks, neglecting the relationships among documents. To address these limitations, we propose an integrated retrieval-summarization framework that captures global document relationships through graph-based clustering, guiding the re-ranking of retrieved documents. This cluster-level thematic information is then used to guide large language models (LLMs) in refining the retrieved documents and generating more accurate, coherent summaries. Experimental results on the ODSUM benchmark demonstrate that our method significantly improves retrieval accuracy and produces summaries that surpass those derived from the oracle documents. These findings highlight the potential of our framework to improve both retrieval and summarization tasks in ODMDS.
Anthology ID:
2025.coling-industry.27
Volume:
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert, Kareem Darwish, Apoorv Agarwal
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
318–330
URL:
https://aclanthology.org/2025.coling-industry.27/
Cite (ACL):
Xiaoping Shen and Yekun Chai. 2025. Graph-Augmented Open-Domain Multi-Document Summarization. In Proceedings of the 31st International Conference on Computational Linguistics: Industry Track, pages 318–330, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Graph-Augmented Open-Domain Multi-Document Summarization (Shen & Chai, COLING 2025)
PDF:
https://aclanthology.org/2025.coling-industry.27.pdf