Automatic Transcription for Estonian Children’s Speech

Agnes Luhtaru, Rauno Jaaska, Karl Kruusamäe, Mark Fishel


Abstract
We evaluate the impact of recent improvements in Automatic Speech Recognition (ASR) on transcribing Estonian children’s speech. Our research focuses on fine-tuning large ASR models with a 10-hour Estonian children’s speech dataset to create accurate transcriptions. Our results show that large pre-trained models hold great potential when fine-tuned first with a more substantial Estonian adult speech corpus and then further trained with children’s speech.
Anthology ID:
2023.nodalida-1.70
Volume:
Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
Month:
May
Year:
2023
Address:
Tórshavn, Faroe Islands
Editors:
Tanel Alumäe, Mark Fishel
Venue:
NoDaLiDa
Publisher:
University of Tartu Library
Pages:
705–709
URL:
https://aclanthology.org/2023.nodalida-1.70
Cite (ACL):
Agnes Luhtaru, Rauno Jaaska, Karl Kruusamäe, and Mark Fishel. 2023. Automatic Transcription for Estonian Children’s Speech. In Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), pages 705–709, Tórshavn, Faroe Islands. University of Tartu Library.
Cite (Informal):
Automatic Transcription for Estonian Children’s Speech (Luhtaru et al., NoDaLiDa 2023)
PDF:
https://aclanthology.org/2023.nodalida-1.70.pdf