Multilingual Models for ASR in Chibchan Languages

Rolando Coto-Solano; Tai Wan Kim; Alexander Jones; Sharid Loáiciga

doi:10.18653/v1/2024.naacl-long.471

Abstract

We present experiments on Automatic Speech Recognition (ASR) for Bribri and Cabécar, two languages from the Chibchan family. We fine-tune four ASR algorithms (Wav2Vec2, Whisper, MMS, and WavLM) to create monolingual models, with the Wav2Vec2 model demonstrating the best performance. We then use Wav2Vec2 for (1) experiments on training joint and transfer learning models for both languages, and (2) an analysis of the errors, with a focus on the transcription of tone. Results show effective transfer learning for both Bribri and Cabécar, but especially for Bribri. A post-processing spell-checking step further reduced character and word error rates. As for the errors, tone is where the Bribri models make the most errors, whereas the simpler tonal system of Cabécar is better transcribed by the model. Our work contributes to developing better ASR technology, an important tool that could facilitate transcription, one of the major bottlenecks in language documentation efforts. Our work also assesses how existing pre-trained models and algorithms perform for genuinely extremely low-resource languages.
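The abstract reports results as character and word error rates. As a rough illustration of how these two metrics are commonly defined (edit distance divided by reference length), here is a minimal sketch. It is not the authors' evaluation code: the function names `edit_distance`, `wer`, and `cer` are hypothetical, and the example strings are English placeholders rather than Bribri or Cabécar data.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Placeholder example (assumption: whitespace tokenization, no text normalization).
print(wer("the cat sat", "the cat sit"))  # one wrong word out of three
print(cer("the cat sat", "the cat sit"))  # one wrong character out of eleven
```

Because this sketch compares Unicode code points directly, a tone diacritic encoded as a separate combining mark counts as its own character. Whether to normalize the text first (for example with `unicodedata.normalize`) is a design choice that can change the reported CER for tonal orthographies.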

Anthology ID: 2024.naacl-long.471
Volume: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Kevin Duh, Helena Gomez, Steven Bethard
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 8521–8535
URL: https://aclanthology.org/2024.naacl-long.471/
DOI: 10.18653/v1/2024.naacl-long.471
Cite (ACL): Rolando Coto-Solano, Tai Wan Kim, Alexander Jones, and Sharid Loáiciga. 2024. Multilingual Models for ASR in Chibchan Languages. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 8521–8535, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): Multilingual Models for ASR in Chibchan Languages (Coto-Solano et al., NAACL 2024)
PDF: https://aclanthology.org/2024.naacl-long.471.pdf
Video: https://aclanthology.org/2024.naacl-long.471.mp4
