Voice Activation Detection for Transcription of Indigenous Languages

Rolando Coto-Solano; Mikaela Browning; Thomas Corrado; Sally Akevai Nicholas

Abstract

Voice Activity Detection (VAD) is the first step in a workflow intended for the automated transcription of Indigenous and low-resource languages. However, VAD's effectiveness at detecting voices in fieldwork settings remains untested. Fieldwork recordings have very different noise and interference conditions from the datasets that mainstream VAD models have been trained on, so these models might fail when confronted with this type of linguistic data. This paper tests four VAD tools using data from two typologically distinct Indigenous languages: Bribri from Costa Rica and Cook Islands Māori from Polynesia. We compare energy-based methods (PyDub), GMM-based methods (WebRTC VAD), and two neural-network-based methods (Silero and SpeechBrain) against human-annotated transcriptions. Our results indicate that hybrid architectures like that of SpeechBrain obtain the best results (89% accuracy for Bribri and 94% for Cook Islands Māori). However, no system performed well when tagging non-speech segments, which might indicate a bias towards misclassifying the natural noise of a fieldwork setting as voice (a false positive). With these findings we hope to inform the selection of VAD tools when implementing ASR workflows.
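
The comparison against human annotations described in the abstract can be illustrated with a minimal sketch. It assumes a frame-level scoring scheme, which the abstract does not specify: the function names, the 0.1-second frame size, and the toy segment times below are all hypothetical and are not taken from the paper. The sketch shows how a system that over-predicts speech can score well overall and on speech frames while scoring poorly on non-speech frames.

```python
def segments_to_frames(segments, duration_s, frame_s=0.1):
    """Convert (start, end) speech segments in seconds to per-frame booleans.

    Assumption: fixed-size frames; a frame is speech if it lies inside a segment.
    """
    n_frames = int(round(duration_s / frame_s))
    frames = [False] * n_frames
    for start, end in segments:
        first = max(0, int(round(start / frame_s)))
        last = min(n_frames, int(round(end / frame_s)))
        for i in range(first, last):
            frames[i] = True
    return frames


def per_class_accuracy(reference, predicted):
    """Return overall, speech-only, and non-speech-only frame accuracy."""
    speech_total = sum(reference)
    nonspeech_total = len(reference) - speech_total
    pairs = list(zip(reference, predicted))
    speech_hits = sum(1 for r, p in pairs if r and p)
    nonspeech_hits = sum(1 for r, p in pairs if not r and not p)
    return {
        "overall": (speech_hits + nonspeech_hits) / len(reference),
        "speech": speech_hits / speech_total if speech_total else None,
        "non_speech": nonspeech_hits / nonspeech_total if nonspeech_total else None,
    }


# Hypothetical 10-second recording: the annotator marked two utterances,
# while the (made-up) VAD output labels almost the whole file as speech.
human = segments_to_frames([(1.0, 4.0), (6.0, 9.0)], duration_s=10.0)
system = segments_to_frames([(0.5, 9.5)], duration_s=10.0)
scores = per_class_accuracy(human, system)
print(scores)
```

In this toy case every annotated speech frame is recovered, yet only a quarter of the non-speech frames are labelled correctly, which is why reporting per-class figures alongside overall accuracy is useful when background noise is likely to be mistaken for voice.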

Anthology ID: 2026.computel-1.19
Volume: Proceedings of the Ninth Workshop on the Use of Computational Methods in the Study of Endangered Languages (ComputEL-9)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Godfred Agyapong, Sarah Moeller, Antti Arppe, Ali Marashian, Daisy Rosenblum
Venues: ComputEL | WS
Publisher: Association for Computational Linguistics
Pages: 177–184
URL: https://aclanthology.org/2026.computel-1.19/
Cite (ACL): Rolando Coto-Solano, Mikaela Browning, Thomas Corrado, and Sally Akevai Nicholas. 2026. Voice Activation Detection for Transcription of Indigenous Languages. In Proceedings of the Ninth Workshop on the Use of Computational Methods in the Study of Endangered Languages (ComputEL-9), pages 177–184, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): Voice Activation Detection for Transcription of Indigenous Languages (Coto-Solano et al., ComputEL 2026)
PDF: https://aclanthology.org/2026.computel-1.19.pdf
Supplementary material: 2026.computel-1.19.SupplementaryMaterial.txt