Jiayi Deng


2025

pdf bib
InstructGEC: Enhancing Unsupervised Grammatical Error Correction with Instruction Tuning
Jiayi Deng | Chen Chen | Chunyan Hou | Xiaojie Yuan
Proceedings of the 31st International Conference on Computational Linguistics

Recent works have proposed methods of generating synthetic data automatically for unsupervised Grammatical Error Correction (GEC). Although a large amount of synthetic data is generated at a low cost, it is unrealistic and of poor quality. The copying phenomenon of synthetic data prevents GEC models from learning the semantic knowledge of contextual language. In this paper, we design an instruction format and use the masking strategy in both an erroneous sentence and the corresponding instruction consistently to alleviate the impact of the copy phenomenon. We also propose a novel approach, InstructGEC, which integrates the knowledge of grammatical detection into GEC models with instruction tuning to address the low-quality issue. Experiments are conducted on English and Chinese GEC datasets and results demonstrate that our method outperforms state-of-the-art unsupervised GEC methods.