Rosetta-LSF: an Aligned Corpus of French Sign Language and French for Text-to-Sign Translation

Elise Bertin-Lemée, Annelies Braffort, Camille Challant, Claire Danet, Boris Dauriac, Michael Filhol, Emmanuella Martinod, Jérémie Segouat


Abstract
This article presents a new French Sign Language (LSF) corpus called “Rosetta-LSF”. It was created to support future studies on the automatic translation of written French into LSF, rendered through the animation of a virtual signer. An overview of the field highlights the importance of a quality representation of LSF. To obtain quality animations that signers can understand, this representation must go beyond the simple “gloss transcription” of the LSF lexical units used in the discourse. To achieve this, we designed a corpus composed of four types of aligned data and evaluated its usability. The four types are: news headlines in French; translations of these headlines into LSF in the form of videos showing animations of a virtual signer; gloss annotations of the “traditional” type, extended with additional information on the context in which each gestural unit is performed and on its potential for adaptation to another context; and AZee representations of the videos, i.e. formal expressions capturing the necessary and sufficient linguistic information. This article describes the data and presents an example from the corpus. The corpus is available online for public research.
Anthology ID:
2022.lrec-1.529
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
4955–4962
URL:
https://aclanthology.org/2022.lrec-1.529
Cite (ACL):
Elise Bertin-Lemée, Annelies Braffort, Camille Challant, Claire Danet, Boris Dauriac, Michael Filhol, Emmanuella Martinod, and Jérémie Segouat. 2022. Rosetta-LSF: an Aligned Corpus of French Sign Language and French for Text-to-Sign Translation. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 4955–4962, Marseille, France. European Language Resources Association.
Cite (Informal):
Rosetta-LSF: an Aligned Corpus of French Sign Language and French for Text-to-Sign Translation (Bertin-Lemée et al., LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.529.pdf