Between the Drafts: An Evaluation Framework for Identifying Quality Improvement and Stylistic Differences in Scientific Texts

Danqing Chen; Ingo Weber; Felix Dietrich

Abstract

This study explores the potential of a lightweight, open-source Large Language Model (LLM), demonstrating how its integration with Retrieval-Augmented Generation (RAG) can support cost-effective evaluation of revision quality and writing style differentiation. By retrieving reference documents from a carefully chosen and constructed corpus of peer-reviewed conference proceedings, our framework leverages few-shot in-context learning to track manuscript revisions and venue-specific writing styles. We demonstrate that the LLM-based evaluation aligns closely with human revision histories: it consistently recognizes quality improvements across revision stages and distinguishes writing styles associated with different conference venues. These findings highlight how a carefully designed evaluation framework, combined with adequate, representative data, can advance automated assessment of scientific writing.

Anthology ID: 2025.eval4nlp-1.6
Volume: Proceedings of the 5th Workshop on Evaluation and Comparison of NLP Systems
Month: December
Year: 2025
Address: Mumbai, India
Editors: Mousumi Akter, Tahiya Chowdhury, Steffen Eger, Christoph Leiter, Juri Opitz, Erion Çano
Venues: Eval4NLP | WS
Publisher: Association for Computational Linguistics
Pages: 66–84
URL: https://aclanthology.org/2025.eval4nlp-1.6/
Cite (ACL): Danqing Chen, Ingo Weber, and Felix Dietrich. 2025. Between the Drafts: An Evaluation Framework for Identifying Quality Improvement and Stylistic Differences in Scientific Texts. In Proceedings of the 5th Workshop on Evaluation and Comparison of NLP Systems, pages 66–84, Mumbai, India. Association for Computational Linguistics.
Cite (Informal): Between the Drafts: An Evaluation Framework for Identifying Quality Improvement and Stylistic Differences in Scientific Texts (Chen et al., Eval4NLP 2025)
PDF: https://aclanthology.org/2025.eval4nlp-1.6.pdf
