Automatic Transcription of Grammaticality Judgements for Language Documentation

Éric Le Ferrand, Emily Prud’hommeaux


Abstract
Descriptive linguistics is a sub-field of linguistics that involves the collection and annotation of language resources to describe linguistic phenomena. The transcription of these resources is often described as a tedious task, and Automatic Speech Recognition (ASR) has frequently been employed to support this process. However, the typical research approach to ASR in documentary linguistics often only captures a subset of the field’s diverse reality. In this paper, we focus specifically on one type of data known as grammaticality judgment elicitation in the context of documenting Kréyòl Gwadloupéyen. We show that only a few minutes of speech is enough to fine-tune a model originally trained on French to transcribe segments in Kréyòl.
Anthology ID:
2024.computel-1.6
Volume:
Proceedings of the Seventh Workshop on the Use of Computational Methods in the Study of Endangered Languages
Month:
March
Year:
2024
Address:
St. Julians, Malta
Editors:
Sarah Moeller, Godfred Agyapong, Antti Arppe, Aditi Chaudhary, Shruti Rijhwani, Christopher Cox, Ryan Henke, Alexis Palmer, Daisy Rosenblum, Lane Schwartz
Venues:
ComputEL | WS
Publisher:
Association for Computational Linguistics
Pages:
33–38
URL:
https://aclanthology.org/2024.computel-1.6
Cite (ACL):
Éric Le Ferrand and Emily Prud’hommeaux. 2024. Automatic Transcription of Grammaticality Judgements for Language Documentation. In Proceedings of the Seventh Workshop on the Use of Computational Methods in the Study of Endangered Languages, pages 33–38, St. Julians, Malta. Association for Computational Linguistics.
Cite (Informal):
Automatic Transcription of Grammaticality Judgements for Language Documentation (Le Ferrand & Prud’hommeaux, ComputEL-WS 2024)
PDF:
https://aclanthology.org/2024.computel-1.6.pdf