(Almost) Unsupervised Grammatical Error Correction using Synthetic Comparable Corpus

Satoru Katsumata, Mamoru Komachi


Abstract
We introduce unsupervised grammatical error correction (GEC) techniques based on phrase-based statistical machine translation, trained on a pseudo learner corpus created by Google Translation. We evaluated our GEC system on the low-resource track of the BEA 2019 shared task, where it achieved an F0.5 score of 28.31 points on the test data.
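For readers unfamiliar with the metric named in the abstract, F0.5 is the F-beta score with beta = 0.5, which weights precision twice as heavily as recall. The following is a minimal sketch of that formula, not the shared task's official scorer. The function name `f_beta`, the edit counts in the usage line, and the convention of returning 0.0 when a denominator is zero are all assumptions made for illustration.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """Compute F-beta from counts of correct (tp), spurious (fp),
    and missed (fn) edits. With beta=0.5, precision counts twice
    as much as recall."""
    # Assumed convention: undefined precision/recall is treated as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, not taken from the paper:
# precision = 3/4, recall = 3/6.
score = f_beta(tp=3, fp=1, fn=3)
print(f"F0.5 = {100 * score:.2f}")
```

Scores such as the 28.31 reported above are this quantity expressed on a 0 to 100 scale.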
Anthology ID:
W19-4413
Volume:
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications
Month:
August
Year:
2019
Address:
Florence, Italy
Venues:
ACL | BEA | WS
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
134–138
URL:
https://aclanthology.org/W19-4413
DOI:
10.18653/v1/W19-4413
Cite (ACL):
Satoru Katsumata and Mamoru Komachi. 2019. (Almost) Unsupervised Grammatical Error Correction using Synthetic Comparable Corpus. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 134–138, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
(Almost) Unsupervised Grammatical Error Correction using Synthetic Comparable Corpus (Katsumata & Komachi, 2019)
PDF:
https://aclanthology.org/W19-4413.pdf
Data:
Billion Word Benchmark