2022
探討語者驗證系統中特徵處理模組與注意力機制 (Investigation of Feature Processing Modules and Attention Mechanisms in Speaker Verification System) [In Chinese]
Ting-Wei Chen | Wei-Ting Lin | Chia-Ping Chen | Chung-Li Lu | Bo-Cheng Chan | Yu-Han Cheng | Hsiang-Feng Chuang | Wei-Yu Chen
International Journal of Computational Linguistics & Chinese Language Processing, Volume 27, Number 2, December 2022
Investigation of feature processing modules and attention mechanisms in speaker verification system
Ting-Wei Chen | Wei-Ting Lin | Chia-Ping Chen | Chung-Li Lu | Bo-Cheng Chan | Yu-Han Cheng | Hsiang-Feng Chuang | Wei-Yu Chen
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)
In this paper, we combine several feature front-end modules and attention mechanisms to improve the performance of our speaker verification system. An updated version of ECAPA-TDNN serves as the baseline. We replace and integrate different feature front-end and attention mechanism modules, compare the resulting models, and adopt the most effective design as our final system. We train on the VoxCeleb2 dataset and evaluate our models on several test sets. Our final proposed model improves performance by 16% over the baseline on the VoxSRC2022 validation set.
2021
Discussion on domain generalization in the cross-device speaker verification system
Wei-Ting Lin | Yu-Jia Zhang | Chia-Ping Chen | Chung-Li Lu | Bo-Cheng Chan
Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021)
In this paper, we use domain generalization to improve the performance of a cross-device speaker verification system. Starting from a trainable speaker verification system, we fine-tune the model parameters with domain generalization algorithms. First, we train ECAPA-TDNN on the VoxCeleb2 dataset as a baseline model. We then fine-tune it on the CHT-TDSV dataset with the following domain generalization algorithms: DANN, CDNN, and Deep CORAL. We test the proposed system on 10 different scenarios in the NSYSU-TDSV dataset, covering both single-device and multiple-device conditions. In the multiple-device scenario, the best equal error rate decreases from 18.39 for the baseline to 8.84, showing that the speaker verification system achieves cross-device identification.
2020
Exploring Disparate Language Model Combination Strategies for Mandarin-English Code-Switching ASR
Wei-Ting Lin | Berlin Chen
Proceedings of the 32nd Conference on Computational Linguistics and Speech Processing (ROCLING 2020)