LLMs Behind the Scenes: Enabling Narrative Scene Illustration

Melissa Roemmele, John Joon Young Chung, Taewook Kim, Yuqian Sun, Alex Calderwood, Max Kreminski


Abstract
Generative AI has established the opportunity to readily transform content from one medium to another. This capability is especially powerful for storytelling, where visual illustrations can illuminate a story originally expressed in text. In this paper, we focus on the task of narrative scene illustration, which involves automatically generating an image depicting a scene in a story. Motivated by recent progress on text-to-image models, we consider a pipeline that uses LLMs as an interface for prompting text-to-image models to generate scene illustrations given raw story text. We apply variations of this pipeline to a prominent story corpus in order to synthesize illustrations for scenes in these stories. We conduct a human annotation task to obtain pairwise quality judgments for these illustrations. The outcome of this process is the SceneIllustrations dataset, which we release as a new resource for future work on cross-modal narrative transformation. Through our analysis of this dataset and experiments modeling illustration quality, we demonstrate that LLMs can effectively verbalize scene knowledge implicitly evoked by story text. Moreover, this capability is impactful for generating and evaluating illustrations.
Anthology ID:
2025.emnlp-main.124
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
2439–2457
Language:
URL:
https://aclanthology.org/2025.emnlp-main.124/
DOI:
Bibkey:
Cite (ACL):
Melissa Roemmele, John Joon Young Chung, Taewook Kim, Yuqian Sun, Alex Calderwood, and Max Kreminski. 2025. LLMs Behind the Scenes: Enabling Narrative Scene Illustration. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 2439–2457, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
LLMs Behind the Scenes: Enabling Narrative Scene Illustration (Roemmele et al., EMNLP 2025)
Copy Citation:
PDF:
https://aclanthology.org/2025.emnlp-main.124.pdf
Checklist:
 2025.emnlp-main.124.checklist.pdf