2022
pdf
bib
abs
MC-TRISLAN: A Large 3D Motion Capture Sign Language Data-set
Pavel Jedlička
|
Zdeněk Krňoul
|
Milos Zelezny
|
Ludek Muller
Proceedings of the LREC2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources
The new 3D motion capture data corpus expands the portfolio of existing language resources by a corpus of 18 hours of Czech sign language. This helps to alleviate the current problem, which is a critical lack of high quality data necessary for research and subsequent deployment of machine learning techniques in this area. We currently provide the largest collection of annotated sign language recordings acquired by state-of-the-art 3D human body recording technology for the successful future deployment in communication technologies, especially machine translation and sign language synthesis.
2020
pdf
bib
abs
Sign Language Motion Capture Dataset for Data-driven Synthesis
Pavel Jedlička
|
Zdeněk Krňoul
|
Jakub Kanis
|
Miloš Železný
Proceedings of the LREC2020 9th Workshop on the Representation and Processing of Sign Languages: Sign Language Resources in the Service of the Language Community, Technological Challenges and Application Perspectives
This paper presents a new 3D motion capture dataset of Czech Sign Language (CSE). Its main purpose is to provide the data for further analysis and data-based automatic synthesis of CSE utterances. The content of the data in the given limited domain of weather forecasts was carefully selected by the CSE linguists to provide the necessary utterances needed to produce any new weather forecast. The dataset was recorded using the state-of-the-art motion capture (MoCap) technology to provide the most precise trajectories of the motion. In general, MoCap is a device capable of accurate recording of motion directly in 3D space. The data contains trajectories of body, arms, hands and face markers recorded at once to provide consistent data without the need for the time alignment.
2008
pdf
bib
abs
Design and Recording of Czech Audio-Visual Database with Impaired Conditions for Continuous Speech Recognition
Jana Trojanová
|
Marek Hrúz
|
Pavel Campr
|
Miloš Železný
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
In this paper we discuss the design, acquisition and preprocessing of a Czech audio-visual speech corpus. The corpus is intended for training and testing of existing audio-visual speech recognition system. The name of the database is UWB-07-ICAVR, where ICAVR stands for Impaired Condition Audio Visual speech Recognition. The corpus consists of 10,000 utterances of continuous speech obtained from 50 speakers. The total length of the database is 25 hours. Each utterance is stored as a separate sentence. The corpus extends existing databases by covering condition of variable illumination. We acquired 50 speakers, where half of them were men and half of them were women. Recording was done by two cameras and two microphones. Database introduced in this paper can be used for testing of visual parameterization in audio-visual speech recognition (AVSR). Corpus can be easily split into training and testing part. Each speaker pronounced 200 sentences: first 50 were the same for all, the rest of them were different. Six types of illumination were covered. Session for one speaker can fit on one DVD disk. All files are accompanied by visual labels. Labels specify region of interest (mouth and area around them specified by bounding box). Actual pronunciation of each sentence is transcribed into the text file.