Extracting Sign Language Articulation from Videos with MediaPipe

Carl Börstell


Abstract
This paper evaluates methods for extracting phonological information about Swedish Sign Language signs from video data using MediaPipe’s pose estimation. The methods estimate i) the articulation phase, ii) hand dominance (left vs. right), iii) the number of hands articulating (one- vs. two-handed signs) and iv) the sign’s place of articulation. The results show that MediaPipe’s tracking of the hands’ location and movement in videos can be used to estimate the articulation phase of signs. Whereas including transport movements improves the accuracy of the estimates of hand dominance and number of hands, removing transport movements is crucial for estimating a sign’s place of articulation.
Anthology ID: 2023.nodalida-1.18
Volume: Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
Month: May
Year: 2023
Address: Tórshavn, Faroe Islands
Editors: Tanel Alumäe, Mark Fishel
Venue: NoDaLiDa
Publisher: University of Tartu Library
Pages: 169–178
URL: https://aclanthology.org/2023.nodalida-1.18
Cite (ACL): Carl Börstell. 2023. Extracting Sign Language Articulation from Videos with MediaPipe. In Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), pages 169–178, Tórshavn, Faroe Islands. University of Tartu Library.
Cite (Informal): Extracting Sign Language Articulation from Videos with MediaPipe (Börstell, NoDaLiDa 2023)
PDF: https://aclanthology.org/2023.nodalida-1.18.pdf