以三元組損失微調時延神經網路語者嵌入函數之語者辨識系統(Time Delay Neural Network-based Speaker Embedding Function Fine-tuned with Triplet Loss for Distance-based Speaker Recognition)

Chih-Ting Yehn, Po-Chin Wang, Su-Yu Zhang, Chia-Ping Chen, Shan-Wen Hsiao, Bo-Cheng Chan, Chung-li Lu


Anthology ID:
2019.rocling-1.29
Volume:
Proceedings of the 31st Conference on Computational Linguistics and Speech Processing (ROCLING 2019)
Month:
October
Year:
2019
Address:
New Taipei City, Taiwan
Editors:
Chen-Yu Chiang, Min-Yuh Day, Jen-Tzung Chien
Venue:
ROCLING
Publisher:
The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages:
310–324
Language:
Chinese
URL:
https://aclanthology.org/2019.rocling-1.29
Cite (ACL):
Chih-Ting Yehn, Po-Chin Wang, Su-Yu Zhang, Chia-Ping Chen, Shan-Wen Hsiao, Bo-Cheng Chan, and Chung-li Lu. 2019. 以三元組損失微調時延神經網路語者嵌入函數之語者辨識系統(Time Delay Neural Network-based Speaker Embedding Function Fine-tuned with Triplet Loss for Distance-based Speaker Recognition). In Proceedings of the 31st Conference on Computational Linguistics and Speech Processing (ROCLING 2019), pages 310–324, New Taipei City, Taiwan. The Association for Computational Linguistics and Chinese Language Processing (ACLCLP).
Cite (Informal):
以三元組損失微調時延神經網路語者嵌入函數之語者辨識系統(Time Delay Neural Network-based Speaker Embedding Function Fine-tuned with Triplet Loss for Distance-based Speaker Recognition) (Yehn et al., ROCLING 2019)
PDF:
https://aclanthology.org/2019.rocling-1.29.pdf
Data
MUSAN, VoxCeleb1, VoxCeleb2