Mathieu De Coster


Challenges with Sign Language Datasets for Sign Language Recognition and Translation
Mirella De Sisto | Vincent Vandeghinste | Santiago Egea Gómez | Mathieu De Coster | Dimitar Shterionov | Horacio Saggion
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Sign Languages (SLs) are the primary means of communication for at least half a million people in Europe alone. However, the development of SL recognition and translation tools is slowed down by a series of obstacles concerning resource scarcity and standardization issues in the available data. The former challenge relates to the volume of data available for machine learning as well as the time required to collect and process new data. The latter obstacle is linked to the variety of the data, i.e., annotation formats are not unified and vary amongst different resources. The available data formats are often not suitable for machine learning, obstructing the provision of automatic tools based on neural models. In the present paper, we give an overview of these challenges by comparing various SL corpora and SL machine learning datasets. Furthermore, we propose a framework to address the lack of standardization at format level, unify the available resources and facilitate SL research for different languages. Our framework takes ELAN files as inputs and returns textual and visual data ready to train SL recognition and translation models. We present a proof of concept, training neural translation models on the data produced by the proposed framework.
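
As a rough illustration of the first step such a framework has to perform, the sketch below reads the time-aligned annotations of an ELAN file into plain Python tuples, one list per tier. It is not the authors' implementation: the function name `read_tiers`, the tier names, and the sample annotations are assumptions made for this example. Only the element names (`TIME_SLOT`, `TIER`, `ALIGNABLE_ANNOTATION`, `ANNOTATION_VALUE`) come from the ELAN annotation format.

```python
import xml.etree.ElementTree as ET

# Hand-written miniature of an ELAN (.eaf) document. Tier names and
# annotation values are hypothetical; real files carry more metadata.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="800"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="1500"/>
  </TIME_ORDER>
  <TIER TIER_ID="Gloss">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>HOUSE</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>BIG</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="Translation">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a3" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>The house is big.</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def read_tiers(eaf_xml):
    """Return {tier_id: [(start_ms, end_ms, text), ...]} for an EAF string."""
    root = ET.fromstring(eaf_xml)

    # Map each time slot ID to its value in milliseconds.
    times = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
        if slot.get("TIME_VALUE") is not None
    }

    tiers = {}
    for tier in root.iter("TIER"):
        rows = []
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = times.get(ann.get("TIME_SLOT_REF1"))
            end = times.get(ann.get("TIME_SLOT_REF2"))
            text = (ann.findtext("ANNOTATION_VALUE") or "").strip()
            rows.append((start, end, text))
        # Sort by start time; annotations without a resolved start go last.
        rows.sort(key=lambda row: (row[0] is None, row[0] or 0))
        tiers[tier.get("TIER_ID")] = rows
    return tiers


for tier_id, rows in read_tiers(SAMPLE_EAF).items():
    print(tier_id, rows)
```

The time spans recovered this way could then be used to cut the matching video segments, which is the visual half of the output the abstract describes.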

Sign Language Translation: Ongoing Development, Challenges and Innovations in the SignON Project
Dimitar Shterionov | Mirella De Sisto | Vincent Vandeghinste | Aoife Brady | Mathieu De Coster | Lorraine Leeson | Josep Blat | Frankie Picron | Marcello Paolo Scipioni | Aditya Parikh | Louis ten Bosh | John O’Flaherty | Joni Dambre | Jorn Rijckaert
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

The SignON project focuses on the research and development of a Sign Language (SL) translation mobile application and an open communications framework. SignON rectifies the lack of technology and services for the automatic translation between signed and spoken languages through an inclusive, human-centric solution that facilitates communication between deaf, hard of hearing (DHH) and hearing individuals. We present an overview of the current status of the project, describing the milestones reached to date and the approaches that are being developed to address the challenges and peculiarities of Sign Language Machine Translation (SLMT).


Frozen Pretrained Transformers for Neural Sign Language Translation
Mathieu De Coster | Karel D’Oosterlinck | Marija Pizurica | Paloma Rabaey | Severine Verlinden | Mieke Van Herreweghe | Joni Dambre
Proceedings of the 1st International Workshop on Automatic Translation for Signed and Spoken Languages (AT4SSL)

One of the major challenges in sign language translation from a sign language to a spoken language is the lack of parallel corpora. Recent works have achieved promising results on the RWTH-PHOENIX-Weather 2014T dataset, which consists of over eight thousand parallel sentences between German sign language and German. However, from the perspective of neural machine translation, this is still a tiny dataset. To improve the performance of models trained on small datasets, transfer learning can be used. While this has been previously applied in sign language translation for feature extraction, to the best of our knowledge, pretrained language models have not yet been investigated. We use pretrained BERT-base and mBART-50 models to initialize our sign language video to spoken language text translation model. To mitigate overfitting, we apply the frozen pretrained transformer technique: we freeze the majority of parameters during training. Using a pretrained BERT model, we outperform a baseline trained from scratch by 1 to 2 BLEU-4. Our results show that pretrained language models can be used to improve sign language translation performance and that the self-attention patterns in BERT transfer in zero-shot to the encoder and decoder of sign language translation models.
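
The freezing idea can be sketched without any deep learning library: split the parameters by name into a small trainable set and a large frozen set. Everything specific below is an assumption for illustration, since the abstract does not say which parameters stay trainable. The parameter names, the toy sizes, and the choice to train only layer normalization plus the newly added input and output layers are all hypothetical. In a framework such as PyTorch, the frozen set would correspond to parameters with `requires_grad` set to `False`.

```python
# Toy parameter table: name -> number of weights (hypothetical values).
PARAMS = {
    "video_embedding.weight": 50_000,              # new layer, not pretrained
    "encoder.layer0.attention.weight": 400_000,    # pretrained
    "encoder.layer0.feed_forward.weight": 800_000, # pretrained
    "encoder.layer0.layer_norm.weight": 1_000,     # pretrained
    "decoder.layer0.attention.weight": 400_000,    # pretrained
    "decoder.layer0.feed_forward.weight": 800_000, # pretrained
    "decoder.layer0.layer_norm.weight": 1_000,     # pretrained
    "output_projection.weight": 48_000,            # new layer, not pretrained
}

# Assumption: a parameter is trained if its name contains one of these keys.
TRAINABLE_KEYS = ("layer_norm", "video_embedding", "output_projection")


def split_parameters(params, trainable_keys):
    """Partition parameter names into (trainable, frozen) lists."""
    trainable, frozen = [], []
    for name in params:
        if any(key in name for key in trainable_keys):
            trainable.append(name)
        else:
            frozen.append(name)
    return trainable, frozen


def frozen_fraction(params, frozen_names):
    """Share of all weights that receive no gradient updates."""
    total = sum(params.values())
    return sum(params[name] for name in frozen_names) / total


trainable, frozen = split_parameters(PARAMS, TRAINABLE_KEYS)
print("trainable:", trainable)
print("frozen fraction:", frozen_fraction(PARAMS, frozen))
```

With few weights left to update, the model has less capacity to memorize a small parallel corpus, which is the overfitting concern the abstract raises.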


Sign Language Recognition with Transformer Networks
Mathieu De Coster | Mieke Van Herreweghe | Joni Dambre
Proceedings of the Twelfth Language Resources and Evaluation Conference

Sign languages are complex languages. Research into them is ongoing, supported by large video corpora of which only small parts are annotated. Sign language recognition can be used to speed up the annotation process of these corpora, in order to aid research into sign languages and sign language recognition. Previous research has approached sign language recognition in various ways, using feature extraction techniques or end-to-end deep learning. In this work, we apply a combination of feature extraction using OpenPose for human keypoint estimation and end-to-end feature learning with Convolutional Neural Networks. The proven multi-head attention mechanism used in transformers is applied to recognize isolated signs in the Flemish Sign Language corpus. Our proposed method significantly outperforms the previous state of the art of sign language recognition on the Flemish Sign Language corpus: we obtain an accuracy of 74.7% on a vocabulary of 100 classes. Our results will be implemented as a suggestion system for sign language corpus annotation.
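
For readers unfamiliar with the attention mechanism mentioned above, here is a minimal sketch of scaled dot-product attention, the building block that multi-head attention runs several times in parallel. It is a simplification and not the paper's model: it uses a single head with no learned projections, and the per-frame feature vectors are made-up numbers standing in for keypoint or CNN features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        # Weighted average of the value vectors.
        outputs.append([
            sum(wj * v[c] for wj, v in zip(w, values))
            for c in range(len(values[0]))
        ])
    return outputs, weights


# Hypothetical features for three video frames (two dimensions each).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Self-attention: every frame attends to every frame, itself included.
outputs, weights = attention(frames, frames, frames)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to one, so every output vector is a weighted average of the frame features, letting a frame draw on context from the whole sign.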