Multi-Class Grammatical Error Detection for Correction: A Tale of Two Systems

Zheng Yuan, Shiva Taslimipoor, Christopher Davis, Christopher Bryant


Abstract
In this paper, we show how a multi-class grammatical error detection (GED) system can be used to improve grammatical error correction (GEC) for English. Specifically, we first develop a new state-of-the-art binary detection system based on pre-trained ELECTRA, and then extend it to multi-class detection using different error type tagsets derived from the ERRANT framework. Output from this detection system is used as auxiliary input to fine-tune a novel encoder-decoder GEC model, and we subsequently re-rank the N-best GEC output to find the hypothesis that most agrees with the GED output. Results show that fine-tuning the GEC system using 4-class GED produces the best model, but re-ranking using 55-class GED leads to the best performance overall. This suggests that different multi-class GED systems benefit GEC in different ways. Ultimately, our system outperforms all other previous work that combines GED and GEC, and achieves a new single-model NMT-based state of the art on the BEA-test benchmark.
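The re-ranking step described in the abstract (choosing the N-best hypothesis that most agrees with the GED output) can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the authors' implementation: detection is reduced to binary per-token flags, hypothesis edits are recovered with Python's `difflib` rather than ERRANT, and the helper names `edited_positions`, `agreement`, and `rerank` are hypothetical. Agreement is simply the number of source tokens on which a hypothesis and the detector give the same edited/unedited label.

```python
from difflib import SequenceMatcher


def edited_positions(source, hypothesis):
    """Flag each source token that the hypothesis changes (assumed heuristic)."""
    flags = [False] * len(source)
    matcher = SequenceMatcher(a=source, b=hypothesis, autojunk=False)
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if tag == "insert":
            # An insertion has no source span; attribute it to the next token, if any.
            if i1 < len(source):
                flags[i1] = True
            continue
        # "replace" and "delete" cover source tokens i1..i2-1.
        for i in range(i1, i2):
            flags[i] = True
    return flags


def agreement(source, hypothesis, ged_flags):
    """Count source tokens where the hypothesis and the detector agree."""
    hyp_flags = edited_positions(source, hypothesis)
    return sum(h == g for h, g in zip(hyp_flags, ged_flags))


def rerank(source, nbest, ged_flags):
    """Return the hypothesis with the highest agreement.

    max() keeps the earliest hypothesis on ties, so the original
    GEC ranking acts as the tie-breaker.
    """
    return max(nbest, key=lambda hyp: agreement(source, hyp, ged_flags))


# Toy input: the (hypothetical) detector flags only "go" as erroneous.
source = "She go to school yesterday".split()
ged_flags = [False, True, False, False, False]
nbest = [
    "She go to school yesterday".split(),
    "She went to school yesterday".split(),
    "She went to the school yesterday".split(),
]
best = rerank(source, nbest, ged_flags)
print(" ".join(best))
```

In this toy case the second hypothesis is selected because it edits exactly the token the detector flagged. A multi-class variant would compare error type labels rather than booleans.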
Anthology ID:
2021.emnlp-main.687
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
8722–8736
URL:
https://aclanthology.org/2021.emnlp-main.687
DOI:
10.18653/v1/2021.emnlp-main.687
PDF:
https://aclanthology.org/2021.emnlp-main.687.pdf
Data
FCE