A Comparative Evaluation of End-to-End and Pipeline Approaches for Summarisation

Fahime Same; Saad Mahamood; Srinivas Ramesh Kamath

Abstract

We describe and evaluate two different architectures for creating book highlights from unstructured data. Given the prevalence of large language models, we examine whether a pipeline-based approach with intermediate steps for text generation is still necessary and whether it continues to offer any benefits over an end-to-end approach. Our comparative evaluations using LLM-as-a-judge across multiple models with different parameter sizes and generation scenarios show that highlights generated by the end-to-end approach are preferred. However, there is a slight but consistent increase in faithfulness for the pipeline-generated highlights when generating at a thematic level. Additionally, our analysis across multiple models shows that while larger models are more faithful, the degree of faithfulness increases when they are used with a pipeline architecture. The findings from our work indicate that whilst there is comparability between the two approaches, the greater faithfulness, controllability, and observability of pipeline-based approaches offer tangible benefits in applied settings.

Anthology ID: 2026.retroeval-main.6
Volume: Proceedings of the 1st Symposium on Natural Language Generation Evaluations
Month: June
Year: 2026
Address: Aberdeen, United Kingdom
Editors: Saad Mahamood, David M. Howcroft, Kees van Deemter, Simone Balloccu, Adarsa Sivaprasad, Barkavi Sundararajan, Alberto Bugarín Diz, Jose María Alonso-Moral
Venue: RetroEval
SIG: SIGGEN
Publisher: Association for Computational Linguistics
Pages: 39–52
URL: https://aclanthology.org/2026.retroeval-main.6/
Cite (ACL): Fahime Same, Saad Mahamood, and Srinivas Ramesh Kamath. 2026. A Comparative Evaluation of End-to-End and Pipeline Approaches for Summarisation. In Proceedings of the 1st Symposium on Natural Language Generation Evaluations, pages 39–52, Aberdeen, United Kingdom. Association for Computational Linguistics.
Cite (Informal): A Comparative Evaluation of End-to-End and Pipeline Approaches for Summarisation (Same et al., RetroEval 2026)
PDF: https://aclanthology.org/2026.retroeval-main.6.pdf
