Contextual Modeling for Document-level ASR Error Correction

Jin Jiang, Xunjian Yin, Xiaojun Wan, Wei Peng, Rongjun Li, Jingyuan Yang, Yanquan Zhou


Abstract
Contextual information, both from other sentences in the same document and from other documents in the dataset, plays a crucial role in improving the accuracy of document-level ASR Error Correction (AEC), yet most previous work ignores it. In this paper, we propose a context-aware method that uses a k-Nearest Neighbors (kNN) approach to enhance the AEC model by retrieving from a datastore containing contextual information. We conduct experiments on two English and two Chinese datasets, and the results demonstrate that our proposed model can effectively use contextual information to improve document-level AEC. Furthermore, drawing context from the whole dataset yields even better results.
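The abstract does not spell out how retrieval feeds into the correction model, so the following is only a minimal sketch of one common way to combine a kNN datastore with a model's token distribution (the interpolation used in kNN-LM and kNN-MT style systems). It is an assumption for illustration, not the paper's confirmed method: the function names `knn_distribution` and `interpolate`, the toy 2-dimensional keys, the temperature, and the mixing weight `lam` are all hypothetical.

```python
import math
from collections import defaultdict

def knn_distribution(query, datastore, k=2, temperature=1.0):
    """Turn the k nearest datastore entries into a distribution over tokens."""
    # Squared L2 distance between the query and every stored key.
    scored = [
        (sum((q - x) ** 2 for q, x in zip(query, key)), token)
        for key, token in datastore
    ]
    nearest = sorted(scored, key=lambda pair: pair[0])[:k]
    # Softmax over negative distances: closer neighbours get more weight.
    weights = [math.exp(-dist / temperature) for dist, _ in nearest]
    total = sum(weights)
    probs = defaultdict(float)
    for weight, (_, token) in zip(weights, nearest):
        probs[token] += weight / total
    return dict(probs)

def interpolate(model_probs, knn_probs, lam=0.5):
    """Mix the model's distribution with the retrieved one (lam is hypothetical)."""
    vocab = set(model_probs) | set(knn_probs)
    return {
        token: lam * knn_probs.get(token, 0.0)
        + (1.0 - lam) * model_probs.get(token, 0.0)
        for token in vocab
    }

# Toy datastore: (context representation, correct token) pairs that would be
# built from the same document or from the whole dataset. Values are made up.
datastore = [
    ((0.0, 0.0), "their"),
    ((0.1, 0.0), "their"),
    ((5.0, 5.0), "there"),
]
model_probs = {"there": 0.6, "their": 0.4}  # the model alone prefers the error

knn_probs = knn_distribution(query=(0.0, 0.1), datastore=datastore, k=2)
final = interpolate(model_probs, knn_probs, lam=0.5)
best = max(final, key=final.get)
```

In this toy case both retrieved neighbours vote for "their", so the mixed distribution overrides the model's initial preference for "there". A real system would use learned hidden states as keys and an approximate nearest-neighbour index rather than a linear scan.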
Anthology ID:
2024.lrec-main.341
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
3855–3867
URL:
https://aclanthology.org/2024.lrec-main.341
Cite (ACL):
Jin Jiang, Xunjian Yin, Xiaojun Wan, Wei Peng, Rongjun Li, Jingyuan Yang, and Yanquan Zhou. 2024. Contextual Modeling for Document-level ASR Error Correction. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 3855–3867, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Contextual Modeling for Document-level ASR Error Correction (Jiang et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.341.pdf