Rais Vaza Man Tazakka
2024
Indonesian-English Code-Switching Speech Recognition Using the Machine Speech Chain Based Semi-Supervised Learning
Rais Vaza Man Tazakka
|
Dessi Lestari
|
Ayu Purwarianti
|
Dipta Tanaya
|
Kurniawati Azizah
|
Sakriani Sakti
Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024
Indonesia is home to a diverse linguistic landscape, where individuals seamlessly transition between Indonesian, English, and local dialects in their everyday conversations—a phenomenon known as code-switching. Understanding and accommodating this linguistic fluidity is essential, particularly in the development of accurate speech recognition systems. However, tackling code-switching in Indonesian poses a challenge due to the scarcity of paired code-switching data. Thus, this study endeavors to address Indonesian-English code-switching in speech recognition, leveraging unlabeled data and employing a semi-supervised technique known as the machine speech chain. Our findings demonstrate that the machine speech chain method effectively enhances Automatic Speech Recognition (ASR) performance in recognizing code-switching between Indonesian and English, utilizing previously untapped resources of unlabeled data.