Xinshu Shen


2024

pdf bib
TOREE: Evaluating Topic Relevance of Student Essays for Chinese Primary and Middle School Education
Xinlin Zhuang | Hongyi Wu | Xinshu Shen | Peimin Yu | Gaowei Yi | Xinhao Chen | Tu Hu | Yang Chen | Yupei Ren | Yadong Zhang | Youqi Song | Binxuan Liu | Man Lan
Findings of the Association for Computational Linguistics ACL 2024

Topic relevance of an essay demands that the composition adheres to a clear theme and aligns well with the essay prompt requirements, a critical aspect of essay quality evaluation. However, existing research of Automatic Essay Scoring (AES) for Chinese essays has overlooked topic relevance and lacks detailed feedback, while Automatic Essay Comment Generation (AECG) faces much complexity and difficulty. Additionally, current Large Language Models, including GPT-4, often make incorrect judgments and provide overly impractical feedback when evaluating topic relevance. This paper introduces TOREE (Topic Relevance Evaluation), a comprehensive dataset developed to assess topic relevance in Chinese primary and middle school students’ essays, which is beneficial for AES, AECG and other applications. Moreover, our proposed two-step method utilizes TOREE through a combination of Supervised Fine-tuning and Preference Learning. Experimental results demonstrate that TOREE is of high quality, and our method significantly enhances models’ performance on two designed tasks for topic relevance evaluation, improving both automatic and human evaluations across four diverse LLMs.

2023

pdf bib
A Multi-Task Dataset for Assessing Discourse Coherence in Chinese Essays: Structure, Theme, and Logic Analysis
Hongyi Wu | Xinshu Shen | Man Lan | Shaoguang Mao | Xiaopeng Bai | Yuanbin Wu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

This paper introduces the Chinese Essay Discourse Coherence Corpus (CEDCC), a multi-task dataset for assessing discourse coherence. Existing research tends to focus on isolated dimensions of discourse coherence, a gap which the CEDCC addresses by integrating coherence grading, topical continuity, and discourse relations. This approach, alongside detailed annotations, captures the subtleties of real-world texts and stimulates progress in Chinese discourse coherence analysis. Our contributions include the development of the CEDCC, the establishment of baselines for further research, and the demonstration of the impact of coherence on discourse relation recognition and automated essay scoring. The dataset and related codes is available at https://github.com/cubenlp/CEDCC_corpus.

pdf bib
Overview of CCL23-Eval Task 8: Chinese Essay Fluency Evaluation (CEFE) Task
Xinshu Shen | Hongyi Wu | Xiaopeng Bai | Yuanbin Wu | Aimin Zhou | Shaoguang Mao | Tao Ge | Yan Xia
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)

“This paper provides a comprehensive review of the CCL23-Eval Task 8, i.e., Chinese EssayFluency Evaluation (CEFE). The primary aim of this task is to systematically identify the typesof grammatical fine-grained errors that affect the readability and coherence of essays writtenby Chinese primary and secondary school students, and then to suggest suitable corrections toenhance the fluidity of their written expression. This task consists of three distinct tracks: (1)Coarse-grained and fine-grained error identification; (2) Character-level error identification andcorrection; (3) Error sentence rewriting. In the end, we received 44 completed registration forms,leading to a total of 130 submissions from 11 dedicated participating teams. We present theresults of all participants and our analysis of these results. Both the dataset and evaluation toolused in this task are available1.”