@inproceedings{ogunremi-etal-2023-multilingual,
    title = "Multilingual self-supervised speech representations improve the speech recognition of low-resource {African} languages with codeswitching",
    author = "Ogunremi, Tolulope and
      Manning, Christopher and
      Jurafsky, Dan",
    editor = "Winata, Genta and
      Kar, Sudipta and
      Zhukova, Marina and
      Solorio, Thamar and
      Diab, Mona and
      Sitaram, Sunayana and
      Choudhury, Monojit and
      Bali, Kalika",
    booktitle = "Proceedings of the 6th Workshop on Computational Approaches to Linguistic Code-Switching",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.calcs-1.8",
    pages = "83--88",
    abstract = "While many speakers of low-resource languages regularly code-switch between their languages and other regional languages or English, datasets of codeswitched speech are too small to train bespoke acoustic models from scratch or do language model rescoring. Here we propose finetuning self-supervised speech representations such as wav2vec 2.0 XLSR to recognize code-switched data. We find that finetuning self-supervised multilingual representations and augmenting them with n-gram language models trained from transcripts reduces absolute word error rates by up to 20{\%} compared to baselines of hybrid models trained from scratch on code-switched data. Our findings suggest that in circumstances with limited training data finetuning self-supervised representations is a better performing and viable solution.",
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="ogunremi-etal-2023-multilingual">
<titleInfo>
<title>Multilingual self-supervised speech representations improve the speech recognition of low-resource African languages with codeswitching</title>
</titleInfo>
<name type="personal">
<namePart type="given">Tolulope</namePart>
<namePart type="family">Ogunremi</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Christopher</namePart>
<namePart type="family">Manning</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Dan</namePart>
<namePart type="family">Jurafsky</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2023-12</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 6th Workshop on Computational Approaches to Linguistic Code-Switching</title>
</titleInfo>
<name type="personal">
<namePart type="given">Genta</namePart>
<namePart type="family">Winata</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sudipta</namePart>
<namePart type="family">Kar</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Marina</namePart>
<namePart type="family">Zhukova</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Thamar</namePart>
<namePart type="family">Solorio</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Mona</namePart>
<namePart type="family">Diab</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sunayana</namePart>
<namePart type="family">Sitaram</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Monojit</namePart>
<namePart type="family">Choudhury</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Kalika</namePart>
<namePart type="family">Bali</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Singapore</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>While many speakers of low-resource languages regularly code-switch between their languages and other regional languages or English, datasets of codeswitched speech are too small to train bespoke acoustic models from scratch or do language model rescoring. Here we propose finetuning self-supervised speech representations such as wav2vec 2.0 XLSR to recognize code-switched data. We find that finetuning self-supervised multilingual representations and augmenting them with n-gram language models trained from transcripts reduces absolute word error rates by up to 20% compared to baselines of hybrid models trained from scratch on code-switched data. Our findings suggest that in circumstances with limited training data finetuning self-supervised representations is a better performing and viable solution.</abstract>
<identifier type="citekey">ogunremi-etal-2023-multilingual</identifier>
<location>
<url>https://aclanthology.org/2023.calcs-1.8</url>
</location>
<part>
<date>2023-12</date>
<extent unit="page">
<start>83</start>
<end>88</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Multilingual self-supervised speech representations improve the speech recognition of low-resource African languages with codeswitching
%A Ogunremi, Tolulope
%A Manning, Christopher
%A Jurafsky, Dan
%Y Winata, Genta
%Y Kar, Sudipta
%Y Zhukova, Marina
%Y Solorio, Thamar
%Y Diab, Mona
%Y Sitaram, Sunayana
%Y Choudhury, Monojit
%Y Bali, Kalika
%S Proceedings of the 6th Workshop on Computational Approaches to Linguistic Code-Switching
%D 2023
%8 December
%I Association for Computational Linguistics
%C Singapore
%F ogunremi-etal-2023-multilingual
%X While many speakers of low-resource languages regularly code-switch between their languages and other regional languages or English, datasets of codeswitched speech are too small to train bespoke acoustic models from scratch or do language model rescoring. Here we propose finetuning self-supervised speech representations such as wav2vec 2.0 XLSR to recognize code-switched data. We find that finetuning self-supervised multilingual representations and augmenting them with n-gram language models trained from transcripts reduces absolute word error rates by up to 20% compared to baselines of hybrid models trained from scratch on code-switched data. Our findings suggest that in circumstances with limited training data finetuning self-supervised representations is a better performing and viable solution.
%U https://aclanthology.org/2023.calcs-1.8
%P 83-88
Markdown (Informal)
[Multilingual self-supervised speech representations improve the speech recognition of low-resource African languages with codeswitching](https://aclanthology.org/2023.calcs-1.8) (Ogunremi et al., CALCS 2023)
ACL