Lost in Stories: Consistency Bugs in Long Story Generation by LLMs

Junjie Li; Xinrui Guo; Yuhao Wu; Roy Ka-Wei Lee; Hongzhi Li; Yutao Xie

Abstract

What happens when a storyteller forgets its own story? Large Language Models (LLMs) can now generate narratives spanning tens of thousands of words, but they often fail to maintain consistency throughout. When generating long-form narratives, these models can contradict their own established facts, character traits, and world rules. Existing story generation benchmarks focus mainly on plot quality and fluency, leaving consistency errors largely unexplored. To address this gap, we present ConStory-Bench, a benchmark designed to evaluate narrative consistency in long-form story generation. It contains 2,000 prompts across four task scenarios and defines a taxonomy of five error categories with 19 fine-grained subtypes. We also develop ConStory-Checker, an automated pipeline that detects contradictions and grounds each judgment in explicit textual evidence. Evaluating a range of LLMs through five research questions, we find that consistency errors show clear tendencies: they are most common in factual and temporal dimensions, tend to appear around the middle of narratives, and occur in text segments with higher token-level entropy; in addition, certain error types tend to co-occur. These findings can inform future efforts to improve consistency in long-form narrative generation.

Anthology ID: 2026.findings-acl.410
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 8400–8428
URL: https://aclanthology.org/2026.findings-acl.410/
Cite (ACL): Junjie Li, Xinrui Guo, Yuhao Wu, Roy Ka-Wei Lee, Hongzhi Li, and Yutao Xie. 2026. Lost in Stories: Consistency Bugs in Long Story Generation by LLMs. In Findings of the Association for Computational Linguistics: ACL 2026, pages 8400–8428, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Lost in Stories: Consistency Bugs in Long Story Generation by LLMs (Li et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.410.pdf
Checklist: 2026.findings-acl.410.checklist.pdf
