Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task

Marcin Junczys-Dowmunt; Roman Grundkiewicz; Shubha Guha; Kenneth Heafield

doi:10.18653/v1/N18-1055

Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task

Marcin Junczys-Dowmunt, Roman Grundkiewicz, Shubha Guha, Kenneth Heafield

Abstract

Previously, neural methods in grammatical error correction (GEC) did not reach state-of-the-art results compared to phrase-based statistical machine translation (SMT) baselines. We demonstrate parallels between neural GEC and low-resource neural MT and successfully adapt several methods from low-resource MT to neural GEC. We further establish guidelines for trustable results in neural GEC and propose a set of model-independent methods for neural GEC that can be easily applied in most GEC settings. Proposed methods include adding source-side noise, domain-adaptation techniques, a GEC-specific training-objective, transfer learning with monolingual data, and ensembling of independently trained GEC models and language models. The combined effects of these methods result in better than state-of-the-art neural GEC models that outperform previously best neural GEC systems by more than 10% M² on the CoNLL-2014 benchmark and 5.9% on the JFLEG test set. Non-neural state-of-the-art systems are outperformed by more than 2% on the CoNLL-2014 benchmark and by 4% on JFLEG.

Anthology ID:: N18-1055
Volume:: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
Month:: June
Year:: 2018
Address:: New Orleans, Louisiana
Editors:: Marilyn Walker, Heng Ji, Amanda Stent
Venue:: NAACL
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 595–606
Language:
URL:: https://aclanthology.org/N18-1055/
DOI:: 10.18653/v1/N18-1055
Bibkey:
Cite (ACL):: Marcin Junczys-Dowmunt, Roman Grundkiewicz, Shubha Guha, and Kenneth Heafield. 2018. Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 595–606, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal):: Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task (Junczys-Dowmunt et al., NAACL 2018)
Copy Citation:
PDF:: https://aclanthology.org/N18-1055.pdf
Video:: https://aclanthology.org/N18-1055.mp4
Code: grammatical/neural-naacl2018
Data: CoNLL, CoNLL-2014 Shared Task: Grammatical Error Correction, FCE, JFLEG

PDF Cite Search Code Video Fix data