Mirage: A Diagnostic Framework for Evaluating the Realism of Synthetic Contact Center Dialogue Generation

Rishikesh Devanathan; Varun Nathan; Ayush Kumar

doi:10.18653/v1/2026.findings-acl.1261

Abstract

Synthetic data is increasingly critical for contact centers, where privacy constraints and data scarcity limit the availability of real conversations. However, generating synthetic dialogues that are realistic and useful for downstream applications remains challenging. In this work, we benchmark multiple generation strategies guided by structured supervision on call attributes (Intent Summaries, Topic Flows, and Quality Assurance (QA) Forms) across multiple languages. To test downstream utility, we evaluate synthetic transcripts on an automated quality assurance (AutoQA) task, finding that prompts optimized on real transcripts consistently outperform those optimized on synthetic transcripts. These results suggest that current synthetic transcripts fall short in capturing the full realism of real agent–customer interactions. To highlight these downstream gaps, we introduce a diagnostic evaluation framework comprising 17 metrics across four dimensions: (1) Emotional and Sentiment Arcs, (2) Linguistic Complexity, (3) Interaction Style, and (4) Conversational Properties. Our analysis shows that even with structured supervision, current generation strategies exhibit measurable deficiencies in sentiment fidelity, disfluency modeling, behavioral variation, and conversational realism. Together, these results highlight the importance of diagnostic, metric-driven evaluation for synthetic conversation generation intended for downstream applications.

Anthology ID: 2026.findings-acl.1261
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 25187–25227
URL: https://aclanthology.org/2026.findings-acl.1261/
DOI: 10.18653/v1/2026.findings-acl.1261
Cite (ACL): Rishikesh Devanathan, Varun Nathan, and Ayush Kumar. 2026. Mirage: A Diagnostic Framework for Evaluating the Realism of Synthetic Contact Center Dialogue Generation. In Findings of the Association for Computational Linguistics: ACL 2026, pages 25187–25227, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Mirage: A Diagnostic Framework for Evaluating the Realism of Synthetic Contact Center Dialogue Generation (Devanathan et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.1261.pdf
Checklist: 2026.findings-acl.1261.checklist.pdf
