PeruSIL: A Framework to Build a Continuous Peruvian Sign Language Interpretation Dataset

Gissella Bejarano, Joe Huamani-Malca, Francisco Cerna-Herrera, Fernando Alva-Manchego, Pablo Rivas


Abstract
Video-based datasets for Continuous Sign Language are scarce because recording videos of native signers is challenging and few people can annotate sign language. COVID-19 has highlighted the key role of sign language interpreters in delivering nationwide health messages to deaf communities. In this paper, we present a framework for creating a multi-modal sign language interpretation dataset based on videos, and we use it to create the first dataset for Peruvian Sign Language (LSP) interpretation, annotated by hearing volunteers with intermediate knowledge of LSP who were guided by the video's audio. We rely on hearing people to produce a first version of the annotations, which should be reviewed by native signers in the future. Our contributions are: i) we design a framework to annotate a sign language dataset; ii) we release the first annotated LSP multi-modal interpretation dataset (AEC); iii) we evaluate the annotation done by hearing people by training a sign language recognition model. Our model reaches up to 80.3% accuracy among a minimum of five classes (signs) in the AEC dataset, and 52.4% in a second dataset. Nevertheless, analysis by subject in the second dataset shows variations worth discussing.
Anthology ID:
2022.signlang-1.1
Volume:
Proceedings of the LREC2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Eleni Efthimiou, Stavroula-Evita Fotinea, Thomas Hanke, Julie A. Hochgesang, Jette Kristoffersen, Johanna Mesch, Marc Schulder
Venue:
SignLang
Publisher:
European Language Resources Association
Pages:
1–8
URL:
https://aclanthology.org/2022.signlang-1.1
Cite (ACL):
Gissella Bejarano, Joe Huamani-Malca, Francisco Cerna-Herrera, Fernando Alva-Manchego, and Pablo Rivas. 2022. PeruSIL: A Framework to Build a Continuous Peruvian Sign Language Interpretation Dataset. In Proceedings of the LREC2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources, pages 1–8, Marseille, France. European Language Resources Association.
Cite (Informal):
PeruSIL: A Framework to Build a Continuous Peruvian Sign Language Interpretation Dataset (Bejarano et al., SignLang 2022)
PDF:
https://aclanthology.org/2022.signlang-1.1.pdf