CopyNE: Better Contextual ASR by Copying Named Entities

Shilin Zhou (周仕林); Zhenghua Li; Yu Hong; Min Zhang; Zhefeng Wang; Baoxing Huai

doi:10.18653/v1/2024.acl-long.147

Abstract

End-to-end automatic speech recognition (ASR) systems have made significant progress in general scenarios. However, it remains challenging to transcribe contextual named entities (NEs) in the contextual ASR scenario. Previous approaches have attempted to address this by utilizing the NE dictionary. These approaches treat entities as individual tokens and generate them token-by-token, which may result in incomplete transcriptions of entities. In this paper, we treat entities as indivisible wholes and introduce the idea of copying into ASR. We design a systematic mechanism called CopyNE, which can copy entities from the NE dictionary. By copying all tokens of an entity at once, we can reduce errors during entity transcription, ensuring the completeness of the entity. Experiments demonstrate that CopyNE consistently improves the accuracy of transcribing entities compared to previous approaches. Even when based on the strong Whisper, CopyNE still achieves notable improvements.
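The copying idea in the abstract can be illustrated with a toy decoding step. This is only a sketch under stated assumptions, not the paper's model: the function name `decode_step`, the score dictionaries, and the rule "copy when the best entity score exceeds the best token score" are all hypothetical choices made for illustration. It shows how treating an entity as one indivisible unit lets a single decision emit all of its tokens at once.

```python
from typing import Dict, List, Tuple

Entity = Tuple[str, ...]  # an entity is a fixed sequence of tokens


def decode_step(token_scores: Dict[str, float],
                entity_scores: Dict[Entity, float]) -> List[str]:
    """Return the tokens emitted at one decoding step.

    Hypothetical rule: copy the best-scoring entity if it beats the
    best-scoring vocabulary token, otherwise emit that single token.
    """
    best_token = max(token_scores, key=token_scores.get)
    if entity_scores:
        best_entity = max(entity_scores, key=entity_scores.get)
        if entity_scores[best_entity] > token_scores[best_token]:
            # Copy every token of the entity in one go, so the entity
            # cannot be left half-transcribed.
            return list(best_entity)
    return [best_token]


# Made-up scores for three decoding steps (illustrative only).
steps = [
    ({"call": 0.9, "tall": 0.1}, {("Zhang", "Wei"): 0.05, ("Bangkok",): 0.01}),
    ({"Zhang": 0.3, "John": 0.2}, {("Zhang", "Wei"): 0.6, ("Bangkok",): 0.02}),
    ({"now": 0.8, "no": 0.2}, {("Zhang", "Wei"): 0.1, ("Bangkok",): 0.05}),
]

transcript: List[str] = []
for token_scores, entity_scores in steps:
    transcript.extend(decode_step(token_scores, entity_scores))

print(transcript)  # ['call', 'Zhang', 'Wei', 'now']
```

In the second step a token-by-token decoder would emit only "Zhang" and could still get the following token wrong, whereas the copy decision emits the full entity in one action.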

Anthology ID: 2024.acl-long.147
Volume: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: August
Year: 2024
Address: Bangkok, Thailand
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 2675–2686
URL: https://aclanthology.org/2024.acl-long.147
DOI: 10.18653/v1/2024.acl-long.147
Cite (ACL): Shilin Zhou, Zhenghua Li, Yu Hong, Min Zhang, Zhefeng Wang, and Baoxing Huai. 2024. CopyNE: Better Contextual ASR by Copying Named Entities. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2675–2686, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal): CopyNE: Better Contextual ASR by Copying Named Entities (Zhou et al., ACL 2024)
PDF: https://aclanthology.org/2024.acl-long.147.pdf
