Beyond Reproduction: A Paired-Task Framework for Assessing LLM Comprehension and Creativity in Literary Translation

Ran Zhang; Steffen Eger; Arda Tezcan; Wei Zhao; Simone Paolo Ponzetto; Lieve Macken

Beyond Reproduction: A Paired-Task Framework for Assessing LLM Comprehension and Creativity in Literary Translation

Ran Zhang, Steffen Eger, Arda Tezcan, Wei Zhao, Simone Paolo Ponzetto, Lieve Macken

Abstract

Large language models (LLMs) are increasingly used for creative tasks such as literary translation. Yet translational creativity remains underexplored and is rarely evaluated at scale, while source-text comprehension is typically studied in isolation, despite the fact that, in professional translation, comprehension and creativity are tightly intertwined. We address these gaps with a paired-task framework applied to literary excerpts from 11 books. Task 1 assesses source-text comprehension, and Task 2 evaluates translational creativity through Units of Creative Potential (UCPs), such as metaphors and wordplay. Using a scalable evaluation setup that combines expert human annotations with UCP-based automatic scoring, we benchmark 23 models and four creativity-oriented prompts. Our findings show that strong comprehension does not translate into human-level creativity: models often produce literal or contextually inappropriate renderings, with particularly large gaps for the more distant English–Chinese language pair. Creativity-oriented prompts yield only modest gains, and only one model, Mistral-Large, comes close to human-level creativity (0.167 vs. 0.246). Across all model–prompt combinations, only three exceed a creativity score of 0.1, while the rest remain at or near zero.

Anthology ID:: 2026.findings-acl.2030
Volume:: Findings of the Association for Computational Linguistics: ACL 2026
Month:: July
Year:: 2026
Address:: San Diego, California, United States
Editors:: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:: Findings
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 40862–40884
Language:
URL:: https://aclanthology.org/2026.findings-acl.2030/
DOI:
Bibkey:
Cite (ACL):: Ran Zhang, Steffen Eger, Arda Tezcan, Wei Zhao, Simone Paolo Ponzetto, and Lieve Macken. 2026. Beyond Reproduction: A Paired-Task Framework for Assessing LLM Comprehension and Creativity in Literary Translation. In Findings of the Association for Computational Linguistics: ACL 2026, pages 40862–40884, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):: Beyond Reproduction: A Paired-Task Framework for Assessing LLM Comprehension and Creativity in Literary Translation (Zhang et al., Findings 2026)
Copy Citation:
PDF:: https://aclanthology.org/2026.findings-acl.2030.pdf
Checklist:: 2026.findings-acl.2030.checklist.pdf

PDF Cite Search Checklist Fix data