Gunnar Örnólfsson


2024

Speech and Language Biomarkers of Neurodegenerative Conditions: Developing Cross-Linguistically Valid Tools for Automatic Analysis
Iris E. Nowenstein | Marija Stanojevic | Gunnar Örnólfsson | María Kristín Jónsdóttir | Bill Simpson | Jennifer Sorinas Nerin | Bryndís Bergþórsdóttir | Kristín Hannesdóttir | Jekaterina Novikova | Jelena Curcic
Proceedings of the Fifth Workshop on Resources and ProcessIng of linguistic, para-linguistic and extra-linguistic Data from people with various forms of cognitive/psychiatric/developmental impairments @LREC-COLING 2024

In the last decade, a rapidly growing body of studies has shown promising results for the automatic detection and extraction of speech and language features as biomarkers of neurodegenerative conditions such as Alzheimer’s disease. This has sparked great optimism and the development of various digital health tools, but also warnings regarding the predominance of English in the field and calls for linguistically diverse research as well as global, equitable access to novel clinical instruments. To automatically extract clinically relevant features from transcripts in low-resource languages, two approaches are possible: 1) utilizing a limited range of language-specific tools or 2) translating text to English and then extracting the features. We evaluate these approaches for part-of-speech (POS) rates in transcripts of recorded picture descriptions from a cross-sectional study of Icelandic speakers at different stages of Alzheimer’s disease and healthy controls. While the translation method merits further exploration, only a subset of the POS categories show a promising correspondence to the direct extraction from the Icelandic transcripts in our results, indicating that the translation method has to be linguistically validated at the individual POS category level.

2021

Talrómur: A large Icelandic TTS corpus
Atli Sigurgeirsson | Þorsteinn Gunnarsson | Gunnar Örnólfsson | Eydís Magnúsdóttir | Ragnheiður Þórhallsdóttir | Stefán Jónsson | Jón Guðnason
Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)

We present Talrómur, a large high-quality Text-To-Speech (TTS) corpus for the Icelandic language. This multi-speaker corpus contains recordings from 4 male speakers and 4 female speakers spanning a wide range of ages and speaking styles. The corpus consists of 122,417 single-utterance recordings equating to approximately 213 hours of voice data. All speakers read from the same script, which has a high coverage of possible Icelandic diphones. Manual analysis of 15,956 utterances indicates that the corpus has a reading mistake rate no higher than 0.25%. We additionally present results from subjective evaluations of the different voices with regard to intelligibility, likeability and trustworthiness.

2020

Manual Speech Synthesis Data Acquisition - From Script Design to Recording Speech
Atli Sigurgeirsson | Gunnar Örnólfsson | Jón Guðnason
Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL)

Atli Þór Sigurgeirsson (atlithors@ru.is), Reykjavik University; Gunnar Thor Örnólfsson (gunnarthor@hi.is), Árni Magnússon Institute for Icelandic Studies; Dr. Jón Guðnason (jg@ru.is)

In this paper we present the work of collecting a large amount of high-quality speech synthesis data for Icelandic. Eight speakers will be recorded for 20 hours each. A script design strategy is proposed, and three scripts of varying length have been generated to maximize diphone coverage. The largest reading script contains 14,400 prompts and includes 87.3% of all Icelandic diphones at least once and 81% of all Icelandic diphones at least twenty times. A recording client was developed to facilitate recording sessions. The client supports easily importing scripts and maintaining multiple collections in parallel. The recorded data can be downloaded straight from the client. Recording sessions are carried out in a professional studio under supervision and started in October 2019. As of writing, 58.7 hours of high-quality speech data have been collected. The scripts, the recording software and the speech data will later be released under a CC-BY 4.0 license.