Tianwen Li

2025

eRevise+RF: A Writing Evaluation System for Assessing Student Essay Revisions and Providing Formative Feedback
Zhexiong Liu | Diane Litman | Elaine L Wang | Tianwen Li | Mason Gobat | Lindsay Clare Matsumura | Richard Correnti
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (System Demonstrations)

The ability to revise essays in response to feedback is important for students’ writing success. An automated writing evaluation (AWE) system that supports students in revising their essays is thus essential. We present eRevise+RF, an enhanced AWE system for assessing student essay revisions (e.g., changes made to an essay to improve its quality in response to essay feedback) and providing revision feedback. We deployed the system with 6 teachers and 406 students across 3 schools in Pennsylvania and Louisiana. The results confirmed its effectiveness in (1) assessing student essays in terms of evidence usage, (2) extracting evidence and reasoning revisions across essays, and (3) determining revision success in responding to feedback. The evaluation also suggested eRevise+RF is a helpful system for young students to improve their argumentative writing skills through revision and formative feedback.

pdf bib abs

Chain-of-Thought Prompting for Automated Evaluation of Revision Patterns in Young Student Writing
Tianwen Li | Michelle Hong | Lindsay Clare Matsumura | Elaine Lin Wang | Diane Litman | Zhexiong Liu | Richard Correnti
Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Works in Progress

This study explores the use of ChatGPT-4.1 as a formative assessment tool for identifying revision patterns in young adolescents’ argumentative writing. ChatGPT-4.1 shows moderate agreement with human coders on identifying evidence-related revision patterns and fair agreement on explanation-related ones. Implications for LLM-assisted formative assessment of young adolescent writing are discussed.

2024

pdf bib abs

Using Large Language Models to Assess Young Students’ Writing Revisions
Tianwen Li | Zhexiong Liu | Lindsay Matsumura | Elaine Wang | Diane Litman | Richard Correnti
Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)

Although effective revision is the crucial component of writing instruction, few automated writing evaluation (AWE) systems specifically focus on the quality of the revisions students undertake. In this study we investigate the use of a large language model (GPT-4) with Chain-of-Thought (CoT) prompting for assessing the quality of young students’ essay revisions aligned with the automated feedback messages they received. Results indicate that GPT-4 has significant potential for evaluating revision quality, particularly when detailed rubrics are included that describe common revision patterns shown by young writers. However, the addition of CoT prompting did not significantly improve performance. Further examination of GPT-4’s scoring performance across various levels of student writing proficiency revealed variable agreement with human ratings. The implications for improving AWE systems focusing on young students are discussed.

Co-authors

Venues

Fix author