(Almost) Unsupervised Grammatical Error Correction using Synthetic Comparable Corpus

Satoru Katsumata, Mamoru Komachi


Abstract
We introduce unsupervised techniques for grammatical error correction (GEC) based on phrase-based statistical machine translation, trained on a pseudo learner corpus created by Google Translation. We evaluated our GEC system on the low-resource track of the BEA 2019 shared task, where it achieved an F0.5 score of 28.31 points on the test data.
Anthology ID:
W19-4413
Volume:
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Helen Yannakoudakis, Ekaterina Kochmar, Claudia Leacock, Nitin Madnani, Ildikó Pilán, Torsten Zesch
Venue:
BEA
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
134–138
URL:
https://aclanthology.org/W19-4413
DOI:
10.18653/v1/W19-4413
Cite (ACL):
Satoru Katsumata and Mamoru Komachi. 2019. (Almost) Unsupervised Grammatical Error Correction using Synthetic Comparable Corpus. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 134–138, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
(Almost) Unsupervised Grammatical Error Correction using Synthetic Comparable Corpus (Katsumata & Komachi, BEA 2019)
PDF:
https://aclanthology.org/W19-4413.pdf
Data
One Billion Word Benchmark