A Neural Grammatical Error Correction System Built On Better Pre-training and Sequential Transfer Learning

Yo Joong Choe, Jiyeon Ham, Kyubyong Park, Yeoil Yoon


Abstract
Grammatical error correction can be viewed as a low-resource sequence-to-sequence task, because publicly available parallel corpora are limited. To tackle this challenge, we first generate erroneous versions of large unannotated corpora using a realistic noising function. The resulting parallel corpora are subsequently used to pre-train Transformer models. Then, by sequentially applying transfer learning, we adapt these models to the domain and style of the test set. Combined with a context-aware neural spellchecker, our system achieves competitive results in both the restricted and low-resource tracks of the ACL 2019 BEA Shared Task. We release all of our code and materials for reproducibility.
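To illustrate the idea behind the abstract's first step, here is a minimal, hypothetical sketch of a noising function that corrupts clean text into erroneous text for synthetic (noisy, clean) training pairs. The confusion set, probabilities, and function names below are illustrative assumptions, not the paper's actual implementation, which models realistic human error patterns.

```python
import random

# Illustrative confusion set of commonly confused word pairs (an assumption,
# not the paper's actual error model).
CONFUSION = {"then": "than", "than": "then", "their": "there",
             "there": "their", "affect": "effect", "effect": "affect"}

def noise_sentence(tokens, p=0.15, rng=None):
    """Return an erroneous copy of `tokens` to pair with the clean original.

    With probability p, a token in the confusion set is replaced by its
    confusable counterpart; with a further small probability, a token is
    dropped. All other tokens are kept unchanged.
    """
    rng = rng or random.Random(0)  # fixed seed for reproducible corpora
    out = []
    for tok in tokens:
        r = rng.random()
        if r < p and tok.lower() in CONFUSION:
            out.append(CONFUSION[tok.lower()])  # substitute a confusable word
        elif r < p * 1.5:
            continue                            # simulate a missing word
        else:
            out.append(tok)                     # keep the token as-is
    return out

clean = "then they went to their house".split()
noisy = noise_sentence(clean)
```

Applying such a function to a large unannotated corpus (e.g. WikiText-103) yields parallel data on which a Transformer can be pre-trained before fine-tuning on annotated GEC corpora.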
Anthology ID:
W19-4423
Volume:
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications
Month:
August
Year:
2019
Address:
Florence, Italy
Venues:
ACL | BEA | WS
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
213–227
URL:
https://aclanthology.org/W19-4423
DOI:
10.18653/v1/W19-4423
Cite (ACL):
Yo Joong Choe, Jiyeon Ham, Kyubyong Park, and Yeoil Yoon. 2019. A Neural Grammatical Error Correction System Built On Better Pre-training and Sequential Transfer Learning. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 213–227, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
A Neural Grammatical Error Correction System Built On Better Pre-training and Sequential Transfer Learning (Choe et al., 2019)
PDF:
https://aclanthology.org/W19-4423.pdf
Code
 kakaobrain/helo_word
Data
WI-LOCNESS
WikiText-103