Correcting Pronoun Homophones with Subtle Semantics in Chinese Speech Recognition

Zhaobo Zhang, Rui Gan, Pingpeng Yuan, Hai Jin


Abstract
Speech recognition is becoming prevalent in daily life. However, because the entities they refer to appear in similar semantic contexts and their Chinese pronunciations overlap, pronoun homophones, especially “他/她/它 (he/she/it)”, all pronounced “Tā”, are usually recognized incorrectly. Correcting them automatically during the post-processing of Chinese speech recognition is therefore a challenge. In this paper, we propose three models that address these common confusions, each tailored to a different application scenario. We implement a language model, an LSTM model with semantic features, and a rule-assisted N-gram model, which together cover a wide range of requirements, from high-precision settings to low-resource offline devices. Extensive experiments show that our models achieve the highest recognition rate for “Tā” correction, improving it from 70% in popular voice input methods to up to 90%. Further ablation analysis underscores the effectiveness of our models in enhancing recognition accuracy. Our models thus improve the overall experience of Chinese speech recognition for “Tā” and reduce the burden of manually correcting transcriptions.
Anthology ID:
2024.lrec-main.360
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
4047–4058
URL:
https://aclanthology.org/2024.lrec-main.360
Cite (ACL):
Zhaobo Zhang, Rui Gan, Pingpeng Yuan, and Hai Jin. 2024. Correcting Pronoun Homophones with Subtle Semantics in Chinese Speech Recognition. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 4047–4058, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Correcting Pronoun Homophones with Subtle Semantics in Chinese Speech Recognition (Zhang et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.360.pdf