SAUCE: Summary Analysis Using Conversation Entailment

Man-Ling Sung; Hemanth Kandula; Jeff Ma; William Hartmann; Matthew Snover

Abstract

With the growing need for evaluating Large Language Models (LLMs) and their applications to speech, challenges persist in summarizing and evaluating conversations that lack a clear end goal. We introduce SAUCE, a reference-free, fact-based evaluation pipeline for cross-lingual conversational speech summarization. It measures the accuracy and the fact coverage of a summary through the entailment between conversation and text. We compare SAUCE against several popular summarization metrics and demonstrate its effectiveness at capturing information loss due to transcription and translation errors and at identifying broken summaries. Crucially, unlike black-box LLM evaluators or dense embedding metrics, SAUCE is inherently explainable: it maps summary scores to discrete, verifiable facts, allowing users to pinpoint exact hallucinations or omissions. We illustrate how this interpretability helps developers systematically profile LLM behaviors and gives end-users an actionable tool to verify summary accuracy in noisy, real-world conditions. Preliminary investigations show that SAUCE aligns strongly with human judgment.

Anthology ID: 2026.gem-main.34
Volume: Proceedings of the Fifth Workshop on Generation, Evaluation and Metrics (GEM)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Simon Mille, Sebastian Gehrmann, Patrícia Schmidtová, Ondřej Dušek, Marzieh Fadaee, Kyle Lo, Enrico Santus, Gabriel Stanovsky
Venues: GEM | WS
Publisher: Association for Computational Linguistics
Pages: 364–377
URL: https://aclanthology.org/2026.gem-main.34/
Cite (ACL): Man-Ling Sung, Hemanth Kandula, Jeff Ma, William Hartmann, and Matthew Snover. 2026. SAUCE: Summary Analysis Using Conversation Entailment. In Proceedings of the Fifth Workshop on Generation, Evaluation and Metrics (GEM), pages 364–377, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): SAUCE: Summary Analysis Using Conversation Entailment (Sung et al., GEM 2026)
PDF: https://aclanthology.org/2026.gem-main.34.pdf
