Standardising Pronunciation for a Grapheme-to-Phoneme Converter for Faroese

Sandra Lamhauge, Iben Debess, Carlos Hernández Mena, Annika Simonsen, Jon Gudnason


Abstract
Pronunciation dictionaries allow computational modelling of the pronunciation of words in a given language and are widely used in speech technologies, especially in speech recognition and synthesis. A grapheme-to-phoneme (G2P) tool, by contrast, generalizes a pronunciation dictionary: it is not limited to a given, finite vocabulary. In this paper, we present a set of standardized phonological rules for the Faroese language; we introduce FARSAMPA, a machine-readable character set suitable for phonetic transcription of Faroese; and we present a set of grapheme-to-phoneme models for Faroese, which are publicly available and shared under a Creative Commons license. We present the G2P converter and evaluate its performance. The evaluation shows reliable results that demonstrate the quality of the data.
Anthology ID:
2023.nodalida-1.32
Volume:
Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
Month:
May
Year:
2023
Address:
Tórshavn, Faroe Islands
Editors:
Tanel Alumäe, Mark Fishel
Venue:
NoDaLiDa
Publisher:
University of Tartu Library
Pages:
308–317
URL:
https://aclanthology.org/2023.nodalida-1.32
Cite (ACL):
Sandra Lamhauge, Iben Debess, Carlos Hernández Mena, Annika Simonsen, and Jon Gudnason. 2023. Standardising Pronunciation for a Grapheme-to-Phoneme Converter for Faroese. In Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), pages 308–317, Tórshavn, Faroe Islands. University of Tartu Library.
Cite (Informal):
Standardising Pronunciation for a Grapheme-to-Phoneme Converter for Faroese (Lamhauge et al., NoDaLiDa 2023)
PDF:
https://aclanthology.org/2023.nodalida-1.32.pdf