@inproceedings{kermani-etal-2025-finetuning,
title = "Finetuning Pre-trained Language Models for Bidirectional Sign Language Gloss to Text Translation",
author = "Kermani, Arshia and
Irani, Habib and
Metsis, Vangelis",
editor = "Hasanuzzaman, Mohammed and
Quiroga, Facundo Manuel and
Modi, Ashutosh and
Kamila, Sabyasachi and
Artiaga, Keren and
Joshi, Abhinav and
Singh, Sanjeet",
booktitle = "Proceedings of the Workshop on Sign Language Processing (WSLP)",
month = dec,
year = "2025",
address = "IIT Bombay, Mumbai, India (Co-located with IJCNLP{--}AACL 2025)",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.wslp-main.11/",
pages = "73--81",
ISBN = "979-8-89176-304-3",
abstract = "Sign Language Translation (SLT) is a crucial technology for fostering communication accessibility for the Deaf and Hard-of-Hearing (DHH) community. A dominant approach in SLT involves a two-stage pipeline: first, transcribing video to sign language glosses, and then translating these glosses into natural text. This second stage, gloss-to-text translation, is a challenging, low-resource machine translation task due to data scarcity and significant syntactic divergence. While prior work has often relied on training translation models from scratch, we show that fine-tuning large, pre-trained language models (PLMs) offers a more effective and data-efficient paradigm. In this work, we conduct a comprehensive bidirectional evaluation of several PLMs (T5, Flan-T5, mBART, and Llama) on this task. We use a collection of popular SLT datasets (RWTH-PHOENIX-14T, SIGNUM, and ASLG-PC12) and evaluate performance using standard machine translation metrics. Our results show that fine-tuned PLMs consistently and significantly outperform Transformer models trained from scratch, establishing new state-of-the-art results. Crucially, our bidirectional analysis reveals a significant performance gap, with Text-to-Gloss translation posing a greater challenge than Gloss-to-Text. We conclude that leveraging the linguistic knowledge of pre-trained models is a superior strategy for gloss translation and provides a more practical foundation for building robust, real-world SLT systems."
}
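The abstract describes fine-tuning pre-trained seq2seq language models (T5, Flan-T5, mBART, Llama) on gloss-to-text pairs and scoring them with standard machine translation metrics. A minimal sketch of that setup follows, using Hugging Face Transformers with t5-small. The in-line gloss/text pairs, the task prefix, the output path, and all hyperparameters are illustrative assumptions, not values taken from the paper.

```python
# Hedged sketch: fine-tune a pre-trained seq2seq PLM (t5-small) for
# Gloss-to-Text translation, as the abstract describes. The tiny in-line
# dataset stands in for corpora such as ASLG-PC12; all pairs are made up.
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)
from datasets import Dataset
import sacrebleu

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Hypothetical ASL-style gloss/English pairs (assumptions, not paper data).
pairs = [
    {"gloss": "X-I LIKE LEARN SIGN-LANGUAGE",
     "text": "I like learning sign language."},
    {"gloss": "WEATHER TOMORROW RAIN MAYBE",
     "text": "It might rain tomorrow."},
]
ds = Dataset.from_list(pairs)

def preprocess(batch):
    # A task prefix is a common T5 convention; this exact wording is assumed.
    inputs = tokenizer(["translate gloss to text: " + g for g in batch["gloss"]],
                       truncation=True, max_length=128)
    labels = tokenizer(text_target=batch["text"], truncation=True, max_length=128)
    inputs["labels"] = labels["input_ids"]
    return inputs

ds = ds.map(preprocess, batched=True, remove_columns=["gloss", "text"])

args = Seq2SeqTrainingArguments(
    output_dir="gloss2text-t5",      # assumed output path
    per_device_train_batch_size=2,
    num_train_epochs=3,
    learning_rate=3e-4,              # assumed; tune per dataset
    logging_steps=1,
    report_to=[],
)
trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=ds,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()

# One way to apply a "standard machine translation metric": generate a
# hypothesis and score it with sacreBLEU against a reference sentence.
model = model.cpu()  # keep generation on CPU for this small sketch
out = model.generate(
    **tokenizer("translate gloss to text: WEATHER TOMORROW RAIN MAYBE",
                return_tensors="pt"),
    max_length=64,
)
hyp = tokenizer.decode(out[0], skip_special_tokens=True)
print(sacrebleu.corpus_bleu([hyp], [["It might rain tomorrow."]]).score)
```

For the Text-to-Gloss direction the abstract also evaluates, the same recipe would apply with source and target columns swapped and a correspondingly reworded task prefix.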