@inproceedings{zhang-etal-2025-dynamic,
title = "Dynamic Feature Fusion for Sign Language Translation Using {H}yper{N}etworks",
author = "Zhang, Ruiquan and
Zhao, Rui and
Wu, Zhicong and
Zhang, Liang and
Zhang, Haoqi and
Chen, Yidong",
editor = "Chiruzzo, Luis and
Ritter, Alan and
Wang, Lu",
booktitle = "Findings of the Association for Computational Linguistics: NAACL 2025",
month = apr,
year = "2025",
address = "Albuquerque, New Mexico",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.findings-naacl.348/",
doi = "10.18653/v1/2025.findings-naacl.348",
pages = "6227--6239",
ISBN = "979-8-89176-195-7",
abstract = "This paper presents an efficient dual-stream early fusion method for sign language translation. Inspired by the brain{'}s ability to process color, shape, and motion simultaneously, the method explores complex dependencies between RGB and keypoint streams, improving speed and efficiency. A key challenge is extracting complementary features from both streams while ensuring global semantic consistency to avoid conflicts and improve generalization. To address this issue, we propose a hypernetwork-based fusion strategy that effectively extracts salient features from RGB and keypoint streams, alongside a partial shortcut connection training method to strengthen the complementary information between the dual streams. Additionally, we introduce self-distillation and SST contrastive learning to maintain feature advantages while aligning the global semantic space. Experiments show that our method achieves state-of-the-art performance on two public sign language datasets, reducing model parameters by about two-thirds."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="zhang-etal-2025-dynamic">
<titleInfo>
<title>Dynamic Feature Fusion for Sign Language Translation Using HyperNetworks</title>
</titleInfo>
<name type="personal">
<namePart type="given">Ruiquan</namePart>
<namePart type="family">Zhang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Rui</namePart>
<namePart type="family">Zhao</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Zhicong</namePart>
<namePart type="family">Wu</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Liang</namePart>
<namePart type="family">Zhang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Haoqi</namePart>
<namePart type="family">Zhang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Yidong</namePart>
<namePart type="family">Chen</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-04</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Findings of the Association for Computational Linguistics: NAACL 2025</title>
</titleInfo>
<name type="personal">
<namePart type="given">Luis</namePart>
<namePart type="family">Chiruzzo</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Alan</namePart>
<namePart type="family">Ritter</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Lu</namePart>
<namePart type="family">Wang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Albuquerque, New Mexico</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-195-7</identifier>
</relatedItem>
<abstract>This paper presents an efficient dual-stream early fusion method for sign language translation. Inspired by the brain’s ability to process color, shape, and motion simultaneously, the method explores complex dependencies between RGB and keypoint streams, improving speed and efficiency. A key challenge is extracting complementary features from both streams while ensuring global semantic consistency to avoid conflicts and improve generalization. To address this issue, we propose a hypernetwork-based fusion strategy that effectively extracts salient features from RGB and keypoint streams, alongside a partial shortcut connection training method to strengthen the complementary information between the dual streams. Additionally, we introduce self-distillation and SST contrastive learning to maintain feature advantages while aligning the global semantic space. Experiments show that our method achieves state-of-the-art performance on two public sign language datasets, reducing model parameters by about two-thirds.</abstract>
<identifier type="citekey">zhang-etal-2025-dynamic</identifier>
<identifier type="doi">10.18653/v1/2025.findings-naacl.348</identifier>
<location>
<url>https://aclanthology.org/2025.findings-naacl.348/</url>
</location>
<part>
<date>2025-04</date>
<extent unit="page">
<start>6227</start>
<end>6239</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Dynamic Feature Fusion for Sign Language Translation Using HyperNetworks
%A Zhang, Ruiquan
%A Zhao, Rui
%A Wu, Zhicong
%A Zhang, Liang
%A Zhang, Haoqi
%A Chen, Yidong
%Y Chiruzzo, Luis
%Y Ritter, Alan
%Y Wang, Lu
%S Findings of the Association for Computational Linguistics: NAACL 2025
%D 2025
%8 April
%I Association for Computational Linguistics
%C Albuquerque, New Mexico
%@ 979-8-89176-195-7
%F zhang-etal-2025-dynamic
%X This paper presents an efficient dual-stream early fusion method for sign language translation. Inspired by the brain’s ability to process color, shape, and motion simultaneously, the method explores complex dependencies between RGB and keypoint streams, improving speed and efficiency. A key challenge is extracting complementary features from both streams while ensuring global semantic consistency to avoid conflicts and improve generalization. To address this issue, we propose a hypernetwork-based fusion strategy that effectively extracts salient features from RGB and keypoint streams, alongside a partial shortcut connection training method to strengthen the complementary information between the dual streams. Additionally, we introduce self-distillation and SST contrastive learning to maintain feature advantages while aligning the global semantic space. Experiments show that our method achieves state-of-the-art performance on two public sign language datasets, reducing model parameters by about two-thirds.
%R 10.18653/v1/2025.findings-naacl.348
%U https://aclanthology.org/2025.findings-naacl.348/
%U https://doi.org/10.18653/v1/2025.findings-naacl.348
%P 6227-6239
Markdown (Informal)
[Dynamic Feature Fusion for Sign Language Translation Using HyperNetworks](https://aclanthology.org/2025.findings-naacl.348/) (Zhang et al., Findings 2025)
ACL
Ruiquan Zhang, Rui Zhao, Zhicong Wu, Liang Zhang, Haoqi Zhang, and Yidong Chen. 2025. [Dynamic Feature Fusion for Sign Language Translation Using HyperNetworks](https://aclanthology.org/2025.findings-naacl.348/). In *Findings of the Association for Computational Linguistics: NAACL 2025*, pages 6227–6239, Albuquerque, New Mexico. Association for Computational Linguistics.
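
The abstract describes a hypernetwork that derives fusion weights for the RGB and keypoint streams before early fusion. As a rough illustration only (this is not the authors' code; the module names, feature dimensions, and gating scheme below are assumptions), a minimal PyTorch sketch of hypernetwork-gated dual-stream fusion might look like:

```python
# Hypothetical sketch of hypernetwork-based dual-stream fusion.
# A small hypernetwork maps the joint RGB + keypoint context to per-channel
# gates for each stream; the gated features are then projected and summed.
import torch
import torch.nn as nn


class HyperFusion(nn.Module):
    def __init__(self, rgb_dim=512, kp_dim=256, fused_dim=512):
        super().__init__()
        # Hypernetwork: produces gating weights for both streams from their concatenation.
        self.hyper = nn.Sequential(
            nn.Linear(rgb_dim + kp_dim, 256),
            nn.ReLU(),
            nn.Linear(256, rgb_dim + kp_dim),
        )
        # Projections of each gated stream into a shared fused space.
        self.rgb_proj = nn.Linear(rgb_dim, fused_dim)
        self.kp_proj = nn.Linear(kp_dim, fused_dim)

    def forward(self, rgb_feat, kp_feat):
        # rgb_feat: (B, T, rgb_dim), kp_feat: (B, T, kp_dim)
        context = torch.cat([rgb_feat, kp_feat], dim=-1)
        gates = torch.sigmoid(self.hyper(context))  # (B, T, rgb_dim + kp_dim)
        rgb_gate, kp_gate = gates.split([rgb_feat.size(-1), kp_feat.size(-1)], dim=-1)
        fused = self.rgb_proj(rgb_gate * rgb_feat) + self.kp_proj(kp_gate * kp_feat)
        return fused


if __name__ == "__main__":
    fusion = HyperFusion()
    rgb = torch.randn(2, 16, 512)   # 2 clips, 16 frames, RGB features
    kp = torch.randn(2, 16, 256)    # matching keypoint features
    print(fusion(rgb, kp).shape)    # torch.Size([2, 16, 512])
```

The partial shortcut connections, self-distillation, and SST contrastive learning described in the abstract are not reflected here; this only illustrates the general idea of using a hypernetwork to weight complementary streams during early fusion.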