Christoph Draxler

2024

Speech Technology Services for Oral History Research
Christoph Draxler | Henk van den Heuvel | Arjan van Hessen | Pavel Ircing | Jan Lehečka
Proceedings of the First Workshop on Holocaust Testimonies as Language Resources (HTRes) @ LREC-COLING 2024

Oral history is about oral sources of witnesses and commentators on historical events. Speech technology is an important instrument to process such recordings in order to obtain transcription and further enhancements to structure the oral account. In this contribution we address the transcription portal and the web services associated with speech processing at BAS, speech solutions developed at LINDAT, how to do it yourself with Whisper, remaining challenges, and future developments.

2020

Building a Time-Aligned Cross-Linguistic Reference Corpus from Language Documentation Data (DoReCo)
Ludger Paschen | François Delafontaine | Christoph Draxler | Susanne Fuchs | Matthew Stave | Frank Seifart
Proceedings of the Twelfth Language Resources and Evaluation Conference

Natural speech data on many languages have been collected by language documentation projects aiming to preserve linguistic and cultural traditions in audiovisual records. These data hold great potential for large-scale cross-linguistic research into phonetics and language processing. Major obstacles to utilizing such data for typological studies include the non-homogeneous nature of file formats and annotation conventions found both across and within archived collections. Moreover, time-aligned audio transcriptions are typically only available at the level of broad (multi-word) phrases but not at the word and segment levels. We report on solutions developed for these issues within the DoReCo (DOcumentation REference COrpus) project. DoReCo aims at providing time-aligned transcriptions for at least 50 collections of under-resourced languages. This paper gives a preliminary overview of the current state of the project and details our workflow, in particular standardization of formats and conventions, the addition of segmental alignments with WebMAUS, and DoReCo’s applicability for subsequent research programs. By making the data accessible to the scientific community, DoReCo is designed to bridge the gap between language documentation and linguistic inquiry.

A CLARIN Transcription Portal for Interview Data
Christoph Draxler | Henk van den Heuvel | Arjan van Hessen | Silvia Calamai | Louise Corti
Proceedings of the Twelfth Language Resources and Evaluation Conference

In this paper we present a first version of a transcription portal for audio files based on automatic speech recognition (ASR) in various languages. The portal is implemented in the CLARIN resources research network and intended for use by non-technical scholars. We explain the background and interdisciplinary nature of interview data, the perks and quirks of using ASR for transcribing the audio in a research context, the dos and don’ts for optimal use of the portal, and future developments foreseen. The portal is promoted in a range of workshops, but there are a number of challenges that have to be met. These challenges concern privacy issues, ASR quality, and cost, amongst others.

2016

The BAS Speech Data Repository
Uwe Reichel | Florian Schiel | Thomas Kisler | Christoph Draxler | Nina Pörner
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The BAS CLARIN speech data repository is introduced. In its current state it comprises 31 predominantly German corpora of spoken language. It is compliant with the CLARIN-D as well as the OLAC requirements. This enables its embedding into several infrastructures. We give an overview of its structure, its implementation, and the corpora it contains.

BAS Speech Science Web Services - an Update of Current Developments
Thomas Kisler | Uwe Reichel | Florian Schiel | Christoph Draxler | Bernhard Jackl | Nina Pörner
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In 2012 the Bavarian Archive for Speech Signals started providing some of its tools from the field of spoken language in the form of Software as a Service (SaaS). This means users access the processing functionality over a web browser and therefore do not have to install complex software packages on a local computer. Amongst others, these tools include segmentation & labeling, grapheme-to-phoneme conversion, text alignment, syllabification and metadata generation, where all but the last are available for a variety of languages. Since its creation the number of available services and the web interface have changed considerably. We give an overview and a detailed description of the system architecture, the available web services and their functionality. Furthermore, we show how the number of files processed by the system has developed over the last four years.

2014

Online experiments with the Percy software framework - experiences and some early results
Christoph Draxler
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In early 2012 the online perception experiment software Percy was deployed on a production server at our lab. Since then, 38 experiments have been made publicly available, with a total of 3078 experiment sessions. In the course of time, the software has been continuously updated and extended to adapt to changing user requirements. Web-based editors for the structure and layout of the experiments have been developed. This paper describes the system architecture, presents usage statistics, discusses typical characteristics of online experiments, and gives an outlook on ongoing work. webapp.phonetik.uni-muenchen.de/WebExperiment lists all currently active experiments.

2008

F0 of Adolescent Speakers - First Results for the German Ph@ttSessionz Database
Christoph Draxler | Florian Schiel | Tania Ellbogen
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

The first release of the German Ph@ttSessionz speech database contains read and spontaneous speech from 864 adolescent speakers and is the largest database of its kind for German. It was recorded via the WWW in over 40 public schools in all dialect regions of Germany. In this paper, we present a cross-sectional study of f0 measurements on this database. The study documents the profound changes in male voices at ages 13-15. Furthermore, it shows that on a perceptual mel scale, there is little difference in the relative f0 variability for male and female speakers. A closer analysis reveals that f0 variability is dependent on the speech style and both the length and the type of the utterance. The study provides statistically reliable voice parameters of adolescent speakers for German. The results may contribute to making spoken dialog systems more robust by restricting user input to utterances with low f0 variability.

2006

Speech Recordings in Public Schools in Germany - the Perfect Show Case for Web-based Recordings and Annotation
Christoph Draxler | Klaus Jänsch
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC'06)

In the Ph@ttSessionz project, geographically distributed high-bandwidth recordings of adolescent speakers are performed in public schools all over Germany. To achieve a consistent technical signal quality, a standard configuration of recording equipment is sent to the participating schools. The recordings are made using the SpeechRecorder software for prompted speech recordings via the WWW. During a recording session, prompts are downloaded from a server, and the speech data is uploaded to the server in a background process. This paper focuses on the technical aspects of the distributed Ph@ttSessionz speech recordings and their annotation.