Learning to Summarize from LLM-generated Feedback

Hwanjun Song; Taewon Yun; Yuho Lee; Jihwan Oh; Gihun Lee; Jason Cai; Hang Su

doi:10.18653/v1/2025.naacl-long.38

Abstract

Developing effective text summarizers remains a challenge due to issues like hallucinations, key information omissions, and verbosity in LLM-generated summaries. This work explores using LLM-generated feedback to improve summary quality by aligning the summaries with human preferences for faithfulness, completeness, and conciseness. We introduce FeedSum, a large-scale dataset containing multi-dimensional LLM feedback on summaries of varying quality across diverse domains. Our experiments show how feedback quality, dimensionality, and granularity influence preference learning, revealing that high-quality, multi-dimensional, fine-grained feedback significantly improves summary generation. We also compare two methods for using this feedback: supervised fine-tuning and direct preference optimization. Finally, we introduce SummLlama3-8B, a model that outperforms the nearly 10x larger Llama3-70b-instruct in generating human-preferred summaries, demonstrating that smaller models can achieve superior performance with appropriate training. The full dataset and the SummLlama3-8B model are available at https://huggingface.co/datasets/DISLab/FeedSum and https://huggingface.co/DISLab/SummLlama3-8B.
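
The abstract names two ways of learning from feedback, supervised fine-tuning (SFT) and direct preference optimization (DPO), without stating their objectives. As a rough illustration only, the sketch below writes out the commonly used per-example form of each loss for a single (chosen, rejected) summary pair. The function names, the `beta` value, and the scalar log-probability inputs are assumptions made for this example; they are not taken from the paper's training setup.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair.

    Each argument is the summed token log-probability of a summary under
    the trainable policy or the frozen reference model. beta (assumed 0.1
    here) scales how strongly the policy is pushed away from the reference.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    return -log_sigmoid(beta * (chosen_margin - rejected_margin))


def sft_loss(policy_chosen_logp: float, num_tokens: int) -> float:
    """SFT loss: mean negative log-likelihood of the preferred summary only."""
    return -policy_chosen_logp / num_tokens


if __name__ == "__main__":
    # Policy identical to the reference: DPO loss equals log(2).
    print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
    # Policy favors the chosen summary more than the reference does: lower loss.
    print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
    # SFT only sees the chosen summary.
    print(sft_loss(-20.0, 10))
```

The contrast is that SFT uses only the preferred summary, while DPO uses the gap between preferred and rejected summaries relative to a reference model, so the rejected summary also contributes to the training signal.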

Anthology ID: 2025.naacl-long.38
Volume: Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month: April
Year: 2025
Address: Albuquerque, New Mexico
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 835–857
URL: https://aclanthology.org/2025.naacl-long.38/
DOI: 10.18653/v1/2025.naacl-long.38
Cite (ACL): Hwanjun Song, Taewon Yun, Yuho Lee, Jihwan Oh, Gihun Lee, Jason Cai, and Hang Su. 2025. Learning to Summarize from LLM-generated Feedback. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 835–857, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): Learning to Summarize from LLM-generated Feedback (Song et al., NAACL 2025)
PDF: https://aclanthology.org/2025.naacl-long.38.pdf
