Capturing Classic Authorial Style in Long-Form Story Generation with GRPO Fine-Tuning

Jinlong Liu; Mark Lee; Mohammed Bahja; Venelin Kovatchev

Abstract

Evaluating and optimizing authorial style in long-form story generation is challenging because style judgments often rely on subjective human voting, and there is no stable automatic evaluation method. We propose a two-stage pipeline. First, we train a style-similarity judge by fine-tuning a sentence-transformer with authorship-verification (AV) supervision, and calibrate its similarity outputs into a bounded [0,1] reward. Second, we use this judge as the primary reward in Group Relative Policy Optimization (GRPO) to fine-tune an 8B story generator for style-conditioned writing, avoiding the accept/reject supervision required by Direct Preference Optimization (DPO). Across four target authors (Mark Twain, Jane Austen, Charles Dickens, Thomas Hardy), the GRPO-trained 8B model achieves higher style scores than open-weight baselines, with an average style score of 0.893 across authors. These results suggest that AV-calibrated reward modeling provides a practical mechanism for controllable long-form style transfer under moderate model size and training budget.
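The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the logistic calibration, its `scale` and `shift` parameters, and the function names `calibrate` and `group_relative_advantages` are all assumptions made here for illustration. The sketch only shows the general idea of mapping a raw similarity score to a bounded reward and then standardizing rewards within a group of sampled stories, as GRPO-style methods do.

```python
import math
from statistics import mean, pstdev


def calibrate(similarity, scale=5.0, shift=0.5):
    """Map a raw similarity score to a reward in (0, 1).

    Assumption: a logistic curve stands in for the paper's calibration,
    which the abstract does not specify.
    """
    return 1.0 / (1.0 + math.exp(-scale * (similarity - shift)))


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of samples for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so samples
    are credited relative to their siblings rather than by absolute score.
    Assumption: population standard deviation is used here.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical judge similarities for four stories sampled from one prompt.
raw_similarities = [0.2, 0.5, 0.8, 0.5]
rewards = [calibrate(s) for s in raw_similarities]
advantages = group_relative_advantages(rewards)
```

Because advantages are computed relative to the group, no accepted/rejected pair is needed: the story closest to the target style receives a positive advantage and the one furthest from it a negative one.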

Anthology ID: 2026.conll-main.31
Volume: Proceedings of the 30th Conference on Computational Natural Language Learning
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Claire Bonial, Yevgeni Berzak
Venues: CoNLL | WS
Publisher: Association for Computational Linguistics
Pages: 526–543
URL: https://aclanthology.org/2026.conll-main.31/
Cite (ACL): Jinlong Liu, Mark G. Lee, Mohammed Bahja, and Venelin Kovatchev. 2026. Capturing Classic Authorial Style in Long-Form Story Generation with GRPO Fine-Tuning. In Proceedings of the 30th Conference on Computational Natural Language Learning, pages 526–543, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): Capturing Classic Authorial Style in Long-Form Story Generation with GRPO Fine-Tuning (Liu et al., CoNLL 2026)
PDF: https://aclanthology.org/2026.conll-main.31.pdf
