A Database of Laryngeal High-Speed Videos with Simultaneous High-Quality Audio Recordings of Pathological and Non-Pathological Voices

Philipp Aichinger; Immer Roesner; Matthias Leonhard; Doris-Maria Denk-Linnert; Wolfgang Bigenzahn; Berit Schneider-Stickler

Abstract

Auditory voice quality judgements are used intensively for the clinical assessment of pathological voice. However, voice quality concepts are fuzzily defined and poorly standardized, which hinders scientific and clinical communication. The described database documents a wide variety of pathologies and is used to investigate auditory voice quality concepts with regard to phonation mechanisms. The database contains 375 laryngeal high-speed videos and simultaneous high-quality audio recordings of sustained phonations of 80 pathological and 40 non-pathological subjects. Interval-wise annotations regarding video and audio quality, as well as voice quality ratings, are provided. Video quality is annotated for the visibility of anatomical structures and for artefacts such as blurring or reduced contrast. Voice quality annotations include ratings on the presence of dysphonia and diplophonia. The purpose of the database is to aid the formulation of observationally well-founded models of phonation and the development of model-based automatic detectors for distinct types of phonation, especially for clinically relevant nonmodal voice phenomena. Another application is the training of audio-based fundamental frequency extractors on video-based reference fundamental frequencies.

Anthology ID: L16-1122
Volume: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month: May
Year: 2016
Address: Portorož, Slovenia
Editors: Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue: LREC
Publisher: European Language Resources Association (ELRA)
Pages: 767–770
URL: https://aclanthology.org/L16-1122/
Cite (ACL): Philipp Aichinger, Immer Roesner, Matthias Leonhard, Doris-Maria Denk-Linnert, Wolfgang Bigenzahn, and Berit Schneider-Stickler. 2016. A Database of Laryngeal High-Speed Videos with Simultaneous High-Quality Audio Recordings of Pathological and Non-Pathological Voices. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 767–770, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal): A Database of Laryngeal High-Speed Videos with Simultaneous High-Quality Audio Recordings of Pathological and Non-Pathological Voices (Aichinger et al., LREC 2016)
PDF: https://aclanthology.org/L16-1122.pdf
