Grammatical Error Correction with Contrastive Learning in Low Error Density Domains

Hannan Cao, Wenmian Yang, Hwee Tou Ng


Abstract
Although grammatical error correction (GEC) has achieved good performance on texts written by learners of English as a second language, performance on low error density domains, where texts are written by English speakers of varying levels of proficiency, can still be improved. In this paper, we propose a contrastive learning approach to encourage the GEC model to assign a higher probability to a correct sentence while reducing the probability of incorrect sentences that the model tends to generate, so as to improve the accuracy of the model. Experimental results show that our approach significantly improves the performance of GEC models in low error density domains, when evaluated on the benchmark CWEB dataset.
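The abstract does not spell out the training objective, so the following is only a minimal sketch of how a margin-based contrastive term could raise the probability of the correct sentence relative to incorrect candidates that the model tends to generate. It assumes a PyTorch sequence-to-sequence GEC model; all function names, arguments, and the margin value are hypothetical and are not taken from the paper or the nusnlp/geccl repository.

# Hypothetical sketch (not the authors' code): a hinge-style contrastive
# loss that pushes the log-probability of the correct sentence above the
# log-probability of an incorrect candidate by at least a margin.
import torch
import torch.nn.functional as F


def sequence_log_prob(logits, target_ids, pad_id):
    # logits:     (batch, seq_len, vocab) decoder outputs
    # target_ids: (batch, seq_len) gold or candidate token ids
    log_probs = F.log_softmax(logits, dim=-1)
    token_lp = log_probs.gather(-1, target_ids.unsqueeze(-1)).squeeze(-1)
    mask = (target_ids != pad_id).float()
    # Sum token log-probabilities over non-padding positions -> (batch,)
    return (token_lp * mask).sum(dim=-1)


def contrastive_loss(pos_logits, pos_ids, neg_logits, neg_ids, pad_id, margin=1.0):
    # One incorrect candidate per example is assumed for simplicity.
    pos_lp = sequence_log_prob(pos_logits, pos_ids, pad_id)  # correct sentence
    neg_lp = sequence_log_prob(neg_logits, neg_ids, pad_id)  # incorrect candidate
    # Penalize cases where the correct sentence does not outscore the
    # incorrect one by at least `margin` in log-probability.
    return torch.clamp(margin - (pos_lp - neg_lp), min=0.0).mean()

In practice such a term would typically be combined with the standard cross-entropy objective, with negative candidates drawn from the model's own outputs (e.g., beam search hypotheses); the exact formulation used in the paper may differ.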
Anthology ID:
2021.findings-emnlp.419
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2021
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
Findings
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
4867–4874
URL:
https://aclanthology.org/2021.findings-emnlp.419
DOI:
10.18653/v1/2021.findings-emnlp.419
Cite (ACL):
Hannan Cao, Wenmian Yang, and Hwee Tou Ng. 2021. Grammatical Error Correction with Contrastive Learning in Low Error Density Domains. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 4867–4874, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Grammatical Error Correction with Contrastive Learning in Low Error Density Domains (Cao et al., Findings 2021)
PDF:
https://aclanthology.org/2021.findings-emnlp.419.pdf
Video:
https://aclanthology.org/2021.findings-emnlp.419.mp4
Code:
nusnlp/geccl