Memory-Efficient Training for Text-Dependent SV with Independent Pre-trained Models

Seyed Ali Farokh; Hossein Zeinali

Abstract

This paper presents our submission to the Iranian division of the Text-Dependent Speaker Verification Challenge (TdSV) 2024. Conventional TdSV approaches typically model speaker and linguistic features jointly, which requires unsegmented inputs during training and incurs high computational costs. These methods also often fine-tune large-scale pre-trained speaker embedding models on the target-domain dataset, which may compromise the models' original ability to capture speaker-specific characteristics. To overcome these limitations, we employ a TdSV system that uses two pre-trained models independently. We demonstrate that, by leveraging pre-trained models with targeted domain adaptation, competitive results can be achieved while avoiding the substantial computational cost of joint fine-tuning on unsegmented inputs. Our best system reached a MinDCF of 0.0358 on the evaluation subset and secured first place in the challenge.

Anthology ID: 2025.rocling-main.11
Volume: Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025)
Month: November
Year: 2025
Address: National Taiwan University, Taipei City, Taiwan
Editors: Kai-Wei Chang, Ke-Han Lu, Chih-Kai Yang, Zhi-Rui Tam, Wen-Yu Chang, Chung-Che Wang
Venue: ROCLING
Publisher: Association for Computational Linguistics
Pages: 95–102
URL: https://aclanthology.org/2025.rocling-main.11/
Cite (ACL): Seyed Ali Farokh and Hossein Zeinali. 2025. Memory-Efficient Training for Text-Dependent SV with Independent Pre-trained Models. In Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025), pages 95–102, National Taiwan University, Taipei City, Taiwan. Association for Computational Linguistics.
Cite (Informal): Memory-Efficient Training for Text-Dependent SV with Independent Pre-trained Models (Farokh & Zeinali, ROCLING 2025)
PDF: https://aclanthology.org/2025.rocling-main.11.pdf
