Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking

Ming Dong, Yujing Chen, Miao Zhang, Hao Sun, Tingting He


Abstract
Chinese Spell Checking (CSC) is a widely used technology that plays a vital role in speech-to-text (STT) and optical character recognition (OCR). Most existing CSC approaches rely on the BERT architecture and achieve excellent performance. However, limited by the scale of the foundation model, BERT-based methods do not work well in few-shot scenarios, showing certain limitations in practical applications. In this paper, we explore an in-context learning method named RS-LLM (Rich Semantic based LLMs) that introduces large language models (LLMs) as the foundation model. We also study the impact of introducing various kinds of Chinese rich semantic information into our framework. We find that by introducing a small number of specific Chinese rich semantic structures, LLMs achieve better performance than most BERT-based models on the few-shot CSC task. Furthermore, we conduct experiments on multiple datasets, and the results verify the superiority of our proposed framework.
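The abstract describes RS-LLM as an in-context learning framework that augments few-shot CSC prompts with Chinese rich semantic information. Below is a minimal sketch of what such a prompt construction might look like, assuming pinyin as one example of a semantic feature and pypinyin as the lookup library; the actual prompt template, semantic structures, and LLM backend used in the paper are not specified in this abstract.

```python
# Minimal sketch: few-shot CSC prompting with semantic annotations.
# The annotation scheme (per-character pinyin) and template below are
# illustrative assumptions, not the paper's exact RS-LLM design.
from pypinyin import lazy_pinyin  # assumed third-party pinyin library


def annotate(sentence: str) -> str:
    """Attach per-character pinyin so the LLM can reason about homophone errors."""
    return " ".join(f"{ch}({py})" for ch, py in zip(sentence, lazy_pinyin(sentence)))


def build_prompt(demos: list[tuple[str, str]], query: str) -> str:
    """Assemble a few-shot prompt: each demo pairs a misspelled, annotated
    sentence with its correction, followed by the annotated query."""
    parts = ["Correct the spelling errors in the Chinese sentence."]
    for wrong, right in demos:
        parts.append(f"Input: {annotate(wrong)}\nOutput: {right}")
    parts.append(f"Input: {annotate(query)}\nOutput:")
    return "\n\n".join(parts)


# Example usage with a single demonstration pair.
demos = [("我今天很高心", "我今天很高兴")]
print(build_prompt(demos, "他明天会来参加会意"))
```

The resulting prompt string would then be sent to an LLM of choice; the key idea conveyed by the abstract is that the added semantic cues help the model detect and correct confusable characters with only a few demonstrations.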
Anthology ID:
2024.findings-acl.439
Volume:
Findings of the Association for Computational Linguistics ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand and virtual meeting
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
7372–7383
URL:
https://aclanthology.org/2024.findings-acl.439
Cite (ACL):
Ming Dong, Yujing Chen, Miao Zhang, Hao Sun, and Tingting He. 2024. Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking. In Findings of the Association for Computational Linguistics ACL 2024, pages 7372–7383, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking (Dong et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.439.pdf