ManWav: The First Manchu ASR Model

Jean Seo, Minha Kang, SungJoo Byun, Sangah Lee


Abstract
This study addresses the widening gap in Automatic Speech Recognition (ASR) research between high-resource and extremely low-resource languages, with a particular focus on Manchu, a severely endangered language. Manchu exemplifies the challenges faced by marginalized linguistic communities in accessing state-of-the-art technologies. In a pioneering effort, we introduce ManWav, the first-ever Manchu ASR model, built on Wav2Vec2-XLSR-53. The results of this first Manchu ASR model are promising, especially when it is trained with our augmented data: Wav2Vec2-XLSR-53 fine-tuned on the augmented data shows a 0.02 drop in CER and a 0.13 drop in WER compared to the same base model fine-tuned on the original data.
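
For readers unfamiliar with how such a model is typically built, the following is a minimal sketch of CTC fine-tuning on top of the Wav2Vec2-XLSR-53 checkpoint with the Hugging Face transformers library. It is illustrative only and not the authors' actual pipeline; the vocabulary file, sampling rate, and all hyperparameters shown here are assumptions.

# Minimal, hypothetical sketch of CTC fine-tuning on Wav2Vec2-XLSR-53.
# "vocab.json" (a character-level Manchu vocabulary) and all settings
# below are assumptions, not the paper's reported configuration.
from transformers import (
    Wav2Vec2CTCTokenizer,
    Wav2Vec2FeatureExtractor,
    Wav2Vec2Processor,
    Wav2Vec2ForCTC,
)

# Character-level tokenizer built from a hypothetical Manchu vocabulary file.
tokenizer = Wav2Vec2CTCTokenizer(
    "vocab.json", unk_token="[UNK]", pad_token="[PAD]", word_delimiter_token="|"
)
feature_extractor = Wav2Vec2FeatureExtractor(
    feature_size=1, sampling_rate=16_000, padding_value=0.0,
    do_normalize=True, return_attention_mask=True,
)
processor = Wav2Vec2Processor(feature_extractor=feature_extractor, tokenizer=tokenizer)

# Load the multilingual XLSR-53 checkpoint and attach a fresh CTC head
# sized to the target vocabulary.
model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/wav2vec2-large-xlsr-53",
    ctc_loss_reduction="mean",
    pad_token_id=processor.tokenizer.pad_token_id,
    vocab_size=len(processor.tokenizer),
)
# Freezing the convolutional feature encoder is a common choice for
# low-resource fine-tuning.
model.freeze_feature_encoder()

# Training would then proceed with transformers.Trainer (or a custom loop)
# over (audio, transcript) pairs; CER and WER can be computed with the
# jiwer package.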
Anthology ID:
2024.fieldmatters-1.2
Volume:
Proceedings of the 3rd Workshop on NLP Applications to Field Linguistics (Field Matters 2024)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Oleg Serikov, Ekaterina Voloshina, Anna Postnikova, Saliha Muradoglu, Eric Le Ferrand, Elena Klyachko, Ekaterina Vylomova, Tatiana Shavrina, Francis Tyers
Venues:
FieldMatters | WS
Publisher:
Association for Computational Linguistics
Pages:
6–11
URL:
https://aclanthology.org/2024.fieldmatters-1.2
DOI:
10.18653/v1/2024.fieldmatters-1.2
Cite (ACL):
Jean Seo, Minha Kang, SungJoo Byun, and Sangah Lee. 2024. ManWav: The First Manchu ASR Model. In Proceedings of the 3rd Workshop on NLP Applications to Field Linguistics (Field Matters 2024), pages 6–11, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
ManWav: The First Manchu ASR Model (Seo et al., FieldMatters-WS 2024)
PDF:
https://aclanthology.org/2024.fieldmatters-1.2.pdf