A Machine Learning-based Segmentation Approach for Measuring Similarity between Sign Languages

Tonni Das Jui, Gissella Bejarano, Pablo Rivas


Abstract
Because varied, native, and continuous datasets are scarce, sign languages are low-resource languages that can benefit from multilingualism in machine translation. To analyze the benefits of approaches like multilingualism, measuring the similarity between sign languages can guide better matches and contributions between languages. However, calculating this similarity manually requires laborious work to measure how close or distant signs and their respective contexts are. For that reason, we propose to support the similarity measurement between sign languages with a video-segmentation-based machine learning model that quantifies the match among signs of different countries’ sign languages. With a machine learning approach, the similarity measurement process can run more smoothly than with a more manual approach. We use a pre-trained temporal segmentation model for British Sign Language (BSL) and test it on three datasets: an American Sign Language (ASL) dataset, an Indian Sign Language (ISL) dataset, and an Australian Sign Language (AUSLAN) dataset. We hypothesize that the percentage of signs segmented and recognized by this model can represent the percentage of overlap, or similarity, between British Sign Language and the other three sign languages. In our ongoing work, we evaluate three metrics that consider Swadesh’s and Woodward’s lists and their synonyms. We found that our intermediate-strict metric coincides with a more classical analysis of the similarity between British and American Sign Language, as well as with the classically low similarity measured between Indian and British Sign Language. On the other hand, our similarity measurement between British and Australian Sign Language holds only for part of the Australian Sign Language data and not for the whole data sample.
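The hypothesis in the abstract, that the share of recognized signs can stand in for overlap between two sign languages, can be illustrated with a minimal sketch. This is not the authors' implementation and does not reproduce any of their three metrics. The function name `similarity_percentage`, the synonym table, and the glosses are hypothetical and chosen only for illustration. The sketch assumes the segmentation model's output has already been reduced to a list of recognized glosses.

```python
def similarity_percentage(recognized, reference, synonyms=None):
    """Return the share of reference glosses found in the recognized glosses,
    as a percentage. Hypothetical helper, not taken from the paper.

    recognized: glosses the model produced for the target-language videos
    reference:  glosses from a word list (for example a Swadesh-style list)
    synonyms:   optional dict mapping a reference gloss to accepted synonyms
    """
    if not reference:
        raise ValueError("reference list must not be empty")
    synonyms = synonyms or {}
    recognized_set = {gloss.lower() for gloss in recognized}

    matched = 0
    for gloss in reference:
        # A reference gloss counts as matched if it, or any accepted synonym,
        # appears among the recognized glosses.
        candidates = {gloss.lower()}
        candidates |= {s.lower() for s in synonyms.get(gloss, ())}
        if candidates & recognized_set:
            matched += 1
    return 100.0 * matched / len(reference)


# Illustrative data only; these are not results from the paper.
reference = ["mother", "father", "water", "dog"]
recognized = ["mum", "water"]

strict = similarity_percentage(recognized, reference)
lenient = similarity_percentage(recognized, reference, {"mother": ["mum"]})
```

Matching with and without the synonym table mirrors the idea of stricter and more lenient metrics: on this toy input the strict variant matches one of four glosses and the lenient variant matches two of four.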
Anthology ID:
2022.signlang-1.15
Volume:
Proceedings of the LREC2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Eleni Efthimiou, Stavroula-Evita Fotinea, Thomas Hanke, Julie A. Hochgesang, Jette Kristoffersen, Johanna Mesch, Marc Schulder
Venue:
SignLang
Publisher:
European Language Resources Association
Pages:
94–101
URL:
https://aclanthology.org/2022.signlang-1.15
Cite (ACL):
Tonni Das Jui, Gissella Bejarano, and Pablo Rivas. 2022. A Machine Learning-based Segmentation Approach for Measuring Similarity between Sign Languages. In Proceedings of the LREC2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources, pages 94–101, Marseille, France. European Language Resources Association.
Cite (Informal):
A Machine Learning-based Segmentation Approach for Measuring Similarity between Sign Languages (Jui et al., SignLang 2022)
PDF:
https://aclanthology.org/2022.signlang-1.15.pdf