Yufei Sun
2024
Modalities Should Be Appropriately Leveraged: Uncertainty Guidance for Multimodal Chinese Spelling Correction
Yongliang Lin | Zhen Zhang | Mengting Hu | Yufei Sun | Yuzhi Zhang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Chinese spelling correction (CSC) aims to detect and correct spelling errors in Chinese text. Most misspelled characters are phonetically or graphically similar to the correct ones, so recent work introduces multimodal features to improve correction. In this paper, we find that different spelling errors are biased toward different modalities, which highlights the importance of exploiting multimodal features appropriately. To this end, we propose the UGMSC framework, which incorporates uncertainty into both the feature learning and correction stages. Specifically, UGMSC makes predictions with multimodal features and estimates the uncertainty of each corresponding modality. It then dynamically fuses the features of all modalities for model learning and performs spelling correction under an uncertainty-guided strategy. Experimental results on three public datasets show that the proposed approach significantly improves over previous strong multimodal models. The framework is model-agnostic and can be easily applied to other multimodal models.
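The abstract does not say how UGMSC estimates uncertainty or how it fuses modalities, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes three hypothetical modality heads (semantic, phonetic, graphic) that each score the same candidate characters, uses normalized entropy as a stand-in uncertainty measure, and fuses the per-modality distributions with weights proportional to confidence. The names `softmax`, `normalized_entropy`, and `fuse_by_uncertainty` are made up for this example.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Entropy scaled by log(K): near 0 for a peaked distribution, 1 for uniform.

    Assumption: entropy stands in for the paper's uncertainty estimate,
    which the abstract does not specify.
    """
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(len(probs))


def fuse_by_uncertainty(modality_logits):
    """Fuse per-modality predictions, trusting confident modalities more.

    modality_logits maps a modality name to its scores over the same
    candidate characters. Returns (fused_probs, weights).
    """
    probs = {name: softmax(scores) for name, scores in modality_logits.items()}
    # Confidence is one minus uncertainty, clamped against rounding error.
    confidence = {
        name: max(0.0, 1.0 - normalized_entropy(p)) for name, p in probs.items()
    }
    total = sum(confidence.values())
    if total == 0.0:
        # Every modality is maximally uncertain: fall back to equal weights.
        weights = {name: 1.0 / len(probs) for name in probs}
    else:
        weights = {name: c / total for name, c in confidence.items()}
    num_candidates = len(next(iter(probs.values())))
    fused = [
        sum(weights[name] * probs[name][i] for name in probs)
        for i in range(num_candidates)
    ]
    return fused, weights


# Hypothetical scores for three candidate characters at one position.
scores = {
    "semantic": [2.0, 1.9, 0.1],  # torn between candidates 0 and 1
    "phonetic": [0.2, 4.0, 0.1],  # confident in candidate 1
    "graphic": [1.0, 1.0, 1.0],   # uninformative
}
fused, weights = fuse_by_uncertainty(scores)
best = max(range(len(fused)), key=fused.__getitem__)
print(best, weights)
```

In this toy case the phonetic head receives most of the weight because its distribution is the most peaked, which mirrors the abstract's point that a phonetic error should lean on the phonetic modality rather than treating all modalities equally.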