Fine-Tuning ASR models for Very Low-Resource Languages: A Study on Mvskoke

Julia Mainzinger, Gina-Anne Levow


Abstract
Recent advancements in multilingual models for automatic speech recognition (ASR) have achieved high accuracy even for languages with extremely limited resources. This study examines ASR modeling for Mvskoke, an Indigenous language of North America. It contrasts the parameter efficiency of adapter training with fine-tuning entire models and demonstrates how performance varies with different amounts of data. Additionally, the models are evaluated with trigram language model decoding, and their outputs are compared across different types of speech recordings. Results show that training an adapter is both parameter-efficient and yields higher accuracy given a relatively small amount of data.
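
As a rough illustration of the adapter approach described in the abstract (a minimal sketch, not the authors' released code), the Python snippet below loads a multilingual CTC checkpoint and freezes everything except the lightweight adapter layers and the output head. The checkpoint name, the placeholder vocabulary size, and the string match on "adapter" in parameter names are illustrative assumptions.

```python
# Illustrative sketch of adapter fine-tuning for CTC-based ASR (not the paper's code).
# Assumptions: the facebook/mms-1b-all checkpoint, a placeholder 40-symbol vocabulary,
# and adapter parameters identifiable by "adapter" in their names.
from transformers import Wav2Vec2ForCTC

model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/mms-1b-all",        # multilingual base model with per-language adapters
    vocab_size=40,                # placeholder size for a Mvskoke character vocabulary
    ignore_mismatched_sizes=True, # re-initialize the CTC head for the new orthography
)

# Freeze everything except the small adapter layers and the CTC output head,
# so only a tiny fraction of the parameters is updated during fine-tuning.
for name, param in model.named_parameters():
    param.requires_grad = ("adapter" in name) or name.startswith("lm_head")

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {trainable:,} of {total:,} ({100 * trainable / total:.2f}%)")
```

In practice, the unfrozen weights would then be trained with a standard CTC objective on transcribed Mvskoke audio, optionally combined with an external trigram language model at decoding time, as the abstract describes.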
Anthology ID:
2024.acl-srw.16
Volume:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Xiyan Fu, Eve Fleisig
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
170–176
URL:
https://aclanthology.org/2024.acl-srw.16
Cite (ACL):
Julia Mainzinger and Gina-Anne Levow. 2024. Fine-Tuning ASR models for Very Low-Resource Languages: A Study on Mvskoke. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 170–176, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Fine-Tuning ASR models for Very Low-Resource Languages: A Study on Mvskoke (Mainzinger & Levow, ACL 2024)
PDF:
https://aclanthology.org/2024.acl-srw.16.pdf