Non-Autoregressive Chinese ASR Error Correction with Phonological Training

Zheng Fang, Ruiqing Zhang, Zhongjun He, Hua Wu, Yanan Cao


Abstract
Automatic Speech Recognition (ASR) is an efficient and widely used input method that transcribes speech signals into text. Because errors introduced by ASR systems impair the performance of downstream tasks, we introduce a post-processing error correction method, PhVEC, to correct errors in text space. For errors in ASR results, existing works mainly focus on fixed-length corrections, which replace each wrong token with a correct one (one-to-one correction), and rarely consider variable-length corrections (one-to-many or many-to-one correction). In this paper, we propose an efficient non-autoregressive (NAR) method for Chinese ASR error correction that handles both cases. Instead of predicting the sentence length, as NAR methods conventionally do, we propose a novel approach that uses phonological tokens to extend the source sentence for variable-length correction, enabling our model to generate phonetically similar corrections. Experimental results on datasets from different domains show that our method significantly reduces word error rate and speeds up inference by 6.2 times compared with the autoregressive model.
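The abstract reports results as word error rate reduction. As a rough illustration of that metric only, the sketch below computes an edit-distance-based error rate over token sequences and the relative reduction between two rates. The function names (`edit_distance`, `error_rate`, `relative_reduction`) and the toy tokens are assumptions made for this example; they are not taken from the paper or its released software, and the paper's exact evaluation setup may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences.

    Counts the minimum number of substitutions, insertions, and
    deletions needed to turn `hyp` into `ref`.
    """
    prev = list(range(len(hyp) + 1))  # distances for an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning an empty hypothesis prefix into ref[:i] costs i
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from the reference side
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance normalised by the reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(rate_before, rate_after):
    """Relative error rate reduction achieved by a correction step."""
    return (rate_before - rate_after) / rate_before


# Toy tokens (hypothetical): a one-to-one error and a length-changing error.
reference = list("abcd")
fixed_length_error = list("abxd")    # one token substituted
variable_length_error = list("abd")  # one token missing

print(error_rate(reference, fixed_length_error))
print(error_rate(reference, variable_length_error))
```

Both toy hypotheses are one edit away from the reference, which shows why a corrector limited to one-to-one substitutions cannot repair the second case: the output length has to change.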
Anthology ID:
2022.naacl-main.432
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
July
Year:
2022
Address:
Seattle, United States
Editors:
Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
5907–5917
URL:
https://aclanthology.org/2022.naacl-main.432
DOI:
10.18653/v1/2022.naacl-main.432
Cite (ACL):
Zheng Fang, Ruiqing Zhang, Zhongjun He, Hua Wu, and Yanan Cao. 2022. Non-Autoregressive Chinese ASR Error Correction with Phonological Training. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 5907–5917, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
Non-Autoregressive Chinese ASR Error Correction with Phonological Training (Fang et al., NAACL 2022)
PDF:
https://aclanthology.org/2022.naacl-main.432.pdf
Software:
 2022.naacl-main.432.software.zip
Video:
 https://aclanthology.org/2022.naacl-main.432.mp4
Data:
AISHELL-1