Text-Derived Language Identity Incorporation for End-to-End Code-Switching Speech Recognition

Qinyi Wang; Haizhou Li

Abstract

Recognizing code-switching (CS) speech is often challenging for an automatic speech recognition (ASR) system because short monolingual segments provide limited linguistic context, which results in language confusion. To mitigate this issue, language identity (LID) is often integrated into the speech recognition system to provide additional linguistic context. However, previous works predominantly focus on extracting language identity from speech signals. We introduce a novel approach that learns language identity from pure text data via a dedicated language identity-language model. We also explore two strategies, LID state fusion and language posterior biasing, to integrate the text-derived language identities into the end-to-end ASR system. By incorporating hypothesized language identities, our ASR system gains crucial contextual cues, effectively capturing language transitions and patterns within code-switched utterances. We conduct speech recognition experiments on the SEAME corpus and demonstrate the effectiveness of our proposed methods. Our results reveal significantly improved transcriptions in code-switching scenarios, underscoring the potential of text-derived LID in enhancing code-switching speech recognition.

Anthology ID: 2023.calcs-1.4
Volume: Proceedings of the 6th Workshop on Computational Approaches to Linguistic Code-Switching
Month: December
Year: 2023
Address: Singapore
Editors: Genta Winata, Sudipta Kar, Marina Zhukova, Thamar Solorio, Mona Diab, Sunayana Sitaram, Monojit Choudhury, Kalika Bali
Venue: CALCS
Publisher: Association for Computational Linguistics
Pages: 33–42
URL: https://aclanthology.org/2023.calcs-1.4/
Cite (ACL): Qinyi Wang and Haizhou Li. 2023. Text-Derived Language Identity Incorporation for End-to-End Code-Switching Speech Recognition. In Proceedings of the 6th Workshop on Computational Approaches to Linguistic Code-Switching, pages 33–42, Singapore. Association for Computational Linguistics.
Cite (Informal): Text-Derived Language Identity Incorporation for End-to-End Code-Switching Speech Recognition (Wang & Li, CALCS 2023)
PDF: https://aclanthology.org/2023.calcs-1.4.pdf
