Unsupervised Grammatical Error Correction Rivaling Supervised Methods

Hannan Cao, Liping Yuan, Yuchen Zhang, Hwee Tou Ng


Abstract
State-of-the-art grammatical error correction (GEC) systems rely on parallel training data (ungrammatical sentences and their manually corrected counterparts), which are expensive to construct. In this paper, we employ the Break-It-Fix-It (BIFI) method to build an unsupervised GEC system. The BIFI framework generates parallel data from unlabeled text using a fixer to transform ungrammatical sentences into grammatical ones, and a critic to predict sentence grammaticality. We present an unsupervised approach to build the fixer and the critic, and an algorithm that allows them to iteratively improve each other. We evaluate our unsupervised GEC system on English and Chinese GEC. Empirical results show that our GEC system outperforms previous unsupervised GEC systems, and achieves performance comparable to supervised GEC systems without ensemble. Furthermore, when combined with labeled training data, our system achieves new state-of-the-art results on the CoNLL-2014 and NLPCC-2018 test sets.
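The abstract describes a loop in which a fixer (ungrammatical-to-grammatical rewriter) and a critic (grammaticality predictor) iteratively improve each other on unlabeled text. The sketch below illustrates one plausible round of such a loop; the `Fixer`/`Critic` interfaces, method names, and the critic-filtering step are illustrative assumptions, not the authors' implementation (see the paper PDF for details).

```python
from typing import List, Tuple

# Hypothetical sketch of a BIFI-style fixer/critic round on unlabeled text.
# Interfaces and method names are placeholders, not the authors' code.

class Critic:
    """Predicts whether a sentence is grammatical."""
    def is_grammatical(self, sentence: str) -> bool:
        raise NotImplementedError

    def retrain(self, fixer: "Fixer", corpus: List[str]) -> "Critic":
        raise NotImplementedError

class Fixer:
    """Rewrites an ungrammatical sentence into a grammatical one."""
    def correct(self, sentence: str) -> str:
        raise NotImplementedError

    def retrain(self, pairs: List[Tuple[str, str]]) -> "Fixer":
        raise NotImplementedError

def bifi_round(corpus: List[str], fixer: Fixer, critic: Critic) -> Tuple[Fixer, Critic]:
    """One round of mutual improvement between fixer and critic."""
    # 1. Use the critic to pick out likely-ungrammatical sentences.
    bad = [s for s in corpus if not critic.is_grammatical(s)]

    # 2. Let the fixer propose corrections; keep only outputs the critic
    #    accepts, yielding pseudo-parallel (source, correction) pairs.
    pairs = []
    for src in bad:
        hyp = fixer.correct(src)
        if critic.is_grammatical(hyp):
            pairs.append((src, hyp))

    # 3. Retrain the fixer on the generated pairs, then refresh the critic,
    #    so the two components improve each other over successive rounds.
    fixer = fixer.retrain(pairs)
    critic = critic.retrain(fixer, corpus)
    return fixer, critic
```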
Anthology ID: 2023.emnlp-main.185
Volume: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 3072–3088
URL: https://aclanthology.org/2023.emnlp-main.185
DOI: 10.18653/v1/2023.emnlp-main.185
Cite (ACL): Hannan Cao, Liping Yuan, Yuchen Zhang, and Hwee Tou Ng. 2023. Unsupervised Grammatical Error Correction Rivaling Supervised Methods. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 3072–3088, Singapore. Association for Computational Linguistics.
Cite (Informal): Unsupervised Grammatical Error Correction Rivaling Supervised Methods (Cao et al., EMNLP 2023)
PDF: https://aclanthology.org/2023.emnlp-main.185.pdf
Video: https://aclanthology.org/2023.emnlp-main.185.mp4