An Error-Guided Correction Model for Chinese Spelling Error Correction

Rui Sun, Xiuyu Wu, Yunfang Wu


Abstract
Although existing neural network approaches have achieved great progress on Chinese spelling correction, there is still room for improvement. A model must avoid over-correction and distinguish a correct token from its phonologically and visually similar ones. In this paper, we propose an error-guided correction model to address these issues. Leveraging the power of the pre-trained BERT model, we propose a novel zero-shot error detection method that performs a preliminary detection, which guides our model to attend more to the probably erroneous tokens during encoding and to avoid modifying correct tokens during generation. Furthermore, we introduce a new loss function that integrates the error confusion set, enabling our model to distinguish similar tokens. Moreover, our model supports highly parallel decoding to meet the demands of real applications. Experiments are conducted on widely used benchmarks. Our model outperforms state-of-the-art approaches by a remarkable margin in both quality and computation speed.
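The zero-shot detection idea can be illustrated with a minimal sketch. This is an assumption about the mechanism, not the paper's actual implementation: it supposes detection works by masking each position in turn and flagging tokens to which a masked-language-model scorer assigns low in-context probability. The `mlm_score` callable and `toy_score` stand-in below are hypothetical, standing in for a real BERT masked-LM head.

```python
# Hedged sketch of zero-shot spelling-error detection (assumed mechanism,
# not the paper's published method): mask each position, score the original
# token's probability in context, and flag low-probability positions.

def detect_errors(tokens, mlm_score, threshold=0.01):
    """Return indices of tokens whose in-context probability is below threshold."""
    flagged = []
    for i, tok in enumerate(tokens):
        masked = tokens[:i] + ["[MASK]"] + tokens[i + 1:]
        # Probability the scorer assigns to the original token at the masked slot.
        p = mlm_score(masked, i, tok)
        if p < threshold:
            flagged.append(i)
    return flagged

# Toy scorer standing in for a BERT masked-LM head: it assigns high
# probability only to tokens from a tiny table of "expected" tokens.
def toy_score(masked_tokens, position, candidate):
    expected = {0: "我", 1: "喜", 2: "欢", 3: "你"}
    return 0.9 if expected.get(position) == candidate else 0.001

print(detect_errors(["我", "西", "欢", "你"], toy_score))  # → [1]
```

In this toy example, "西" at position 1 is a phonologically similar substitution for the expected "喜", so it is the only position flagged; a real detector would replace `toy_score` with probabilities from a pre-trained BERT.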
Anthology ID:
2022.findings-emnlp.278
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2022
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3800–3810
URL:
https://aclanthology.org/2022.findings-emnlp.278
DOI:
10.18653/v1/2022.findings-emnlp.278
Cite (ACL):
Rui Sun, Xiuyu Wu, and Yunfang Wu. 2022. An Error-Guided Correction Model for Chinese Spelling Error Correction. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 3800–3810, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
An Error-Guided Correction Model for Chinese Spelling Error Correction (Sun et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-emnlp.278.pdf