Neural and FST-based approaches to grammatical error correction

Zheng Yuan, Felix Stahlberg, Marek Rei, Bill Byrne, Helen Yannakoudakis


Abstract
In this paper, we describe our submission to the BEA 2019 shared task on grammatical error correction. We present a system pipeline that utilises both error detection and correction models. The input text is first corrected by two complementary neural machine translation systems: one using convolutional networks and multi-task learning, and another using a neural Transformer-based system. Training is performed on publicly available data, along with artificial examples generated through back-translation. The n-best lists of these two machine translation systems are then combined and scored using a finite state transducer (FST). Finally, an unsupervised re-ranking system is applied to the n-best output of the FST. The re-ranker uses a number of error detection features to re-rank the FST n-best list and identify the final 1-best correction hypothesis. Our system achieves 66.75% F0.5 on error correction (ranking 4th), and 82.52% F0.5 on token-level error detection (ranking 2nd) in the restricted track of the shared task.
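The overall pipeline described above (merge the n-best lists of two correction systems, rescore, then re-rank with detection features) can be sketched roughly as follows. This is an illustrative sketch only: the function names, scoring weights, and toy hypotheses are assumptions, not the authors' actual implementation, and the FST rescoring step is abstracted into a plain score combination.

```python
# Hypothetical sketch of the described pipeline: two correction systems each
# produce an n-best list of (hypothesis, score) pairs; the lists are merged,
# and a re-ranker combines the base score with an error-detection feature to
# pick the final 1-best hypothesis. All names and weights are illustrative.

def combine_nbest(nbest_a, nbest_b):
    """Merge two n-best lists, keeping the best score for duplicates."""
    merged = {}
    for hyp, score in nbest_a + nbest_b:
        if hyp not in merged or score > merged[hyp]:
            merged[hyp] = score
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)

def rerank(nbest, detection_score):
    """Re-rank by adding an (assumed) error-detection feature score."""
    return max(nbest, key=lambda kv: kv[1] + detection_score(kv[0]))

# Toy usage with made-up hypotheses and scores:
nbest_cnn = [("He goes to school.", 0.8), ("He go to school.", 0.5)]
nbest_tfm = [("He goes to school.", 0.7), ("He went to school.", 0.6)]
merged = combine_nbest(nbest_cnn, nbest_tfm)
best, _ = rerank(merged, lambda h: 0.1 if "goes" in h else 0.0)
print(best)  # "He goes to school."
```

In the actual system, the merged n-best list would instead be composed with an FST and rescored before the detection-based re-ranking step.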
Anthology ID:
W19-4424
Volume:
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications
Month:
August
Year:
2019
Address:
Florence, Italy
Venues:
ACL | BEA | WS
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
228–239
URL:
https://aclanthology.org/W19-4424
DOI:
10.18653/v1/W19-4424
Cite (ACL):
Zheng Yuan, Felix Stahlberg, Marek Rei, Bill Byrne, and Helen Yannakoudakis. 2019. Neural and FST-based approaches to grammatical error correction. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 228–239, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Neural and FST-based approaches to grammatical error correction (Yuan et al., 2019)
PDF:
https://aclanthology.org/W19-4424.pdf
Data
Billion Word Benchmark | FCE