SiLVERScore: Semantically-Aware Embeddings for Sign Language Generation Evaluation

Saki Imai, Mert Inan, Anthony B. Sicilia, Malihe Alikhani


Abstract
Sign language generation is often evaluated through back-translation, in which generated signs are first recognized back into text and then compared to a reference using text-based metrics. However, this two-step evaluation pipeline introduces ambiguity: it not only fails to capture the multimodal nature of sign language (such as facial expressions, spatial grammar, and prosody) but also makes it hard to pinpoint whether evaluation errors come from the sign generation model or from the translation system used to assess it. In this work, we propose SiLVERScore, a novel semantically-aware, embedding-based evaluation metric that assesses sign language generation in a joint embedding space. Our contributions include: (1) identifying limitations of existing metrics, (2) introducing SiLVERScore for semantically-aware evaluation, (3) demonstrating its robustness to semantic and prosodic variations, and (4) exploring generalization challenges across datasets. On the PHOENIX-14T and CSL-Daily datasets, SiLVERScore achieves near-perfect discrimination between correct and random pairs (ROC AUC = 0.99, overlap < 7%), substantially outperforming traditional metrics.
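As a rough illustration of what "discrimination between correct and random pairs" could mean in a joint embedding space, the sketch below scores sign/text pairs by cosine similarity and computes ROC AUC as the probability that a matched pair outscores a mismatched one. It is not the authors' implementation: the toy vectors, the choice of cosine similarity, and the helper names `cosine_similarity` and `roc_auc` are all assumptions made for illustration, and a real evaluation would use embeddings from a trained sign/text encoder.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def roc_auc(positive_scores, negative_scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical toy embeddings (assumption): row i of each list describes the
# same utterance, once as a sign video embedding and once as a text embedding.
sign_embeddings = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
text_embeddings = [[0.9, 0.2, 0.1], [0.1, 0.8, 0.3], [0.0, 0.1, 0.9]]

n_items = len(sign_embeddings)

# Scores for correct (matched) pairs.
matched = [
    cosine_similarity(sign_embeddings[i], text_embeddings[i])
    for i in range(n_items)
]

# Scores for mismatched pairs, standing in for "random" pairs.
mismatched = [
    cosine_similarity(sign_embeddings[i], text_embeddings[j])
    for i in range(n_items)
    for j in range(n_items)
    if i != j
]

auc = roc_auc(matched, mismatched)
print(f"ROC AUC on toy pairs: {auc:.2f}")
```

An AUC near 1 means matched pairs almost always score higher than mismatched ones; a value near 0.5 would mean the score cannot tell them apart.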
Anthology ID:
2025.ranlp-1.54
Volume:
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Month:
September
Year:
2025
Address:
Varna, Bulgaria
Editors:
Galia Angelova, Maria Kunilovskaya, Marie Escribe, Ruslan Mitkov
Venue:
RANLP
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
452–461
URL:
https://aclanthology.org/2025.ranlp-1.54/
Cite (ACL):
Saki Imai, Mert Inan, Anthony B. Sicilia, and Malihe Alikhani. 2025. SiLVERScore: Semantically-Aware Embeddings for Sign Language Generation Evaluation. In Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era, pages 452–461, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
SiLVERScore: Semantically-Aware Embeddings for Sign Language Generation Evaluation (Imai et al., RANLP 2025)
PDF:
https://aclanthology.org/2025.ranlp-1.54.pdf