Yu-Han Cheng


2022

探討語者驗證系統中特徵處理模組與注意力機制 (Investigation of Feature Processing Modules and Attention Mechanisms in Speaker Verification System) [In Chinese]
Ting-Wei Chen | Wei-Ting Lin | Chia-Ping Chen | Chung-Li Lu | Bo-Cheng Chan | Yu-Han Cheng | Hsiang-Feng Chuang | Wei-Yu Chen
International Journal of Computational Linguistics & Chinese Language Processing, Volume 27, Number 2, December 2022

Investigation of feature processing modules and attention mechanisms in speaker verification system
Ting-Wei Chen | Wei-Ting Lin | Chia-Ping Chen | Chung-Li Lu | Bo-Cheng Chan | Yu-Han Cheng | Hsiang-Feng Chuang | Wei-Yu Chen
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

In this paper, we use several combinations of feature front-end modules and attention mechanisms to improve the performance of our speaker verification system. An updated version of ECAPA-TDNN is chosen as the baseline. We replace and integrate different feature front-end and attention mechanism modules to compare designs and identify the most effective one, which becomes our final system. We use the VoxCeleb 2 dataset as our training set and test our models on several test sets. Our final proposed model improves performance by 16% over the baseline on the VoxSRC2022 validation set.

Lightweight Sound Event Detection Model with RepVGG Architecture
Chia-Chuan Liu | Sung-Jen Huang | Chia-Ping Chen | Chung-Li Lu | Bo-Cheng Chan | Yu-Han Cheng | Hsiang-Feng Chuang | Wei-Yu Chen
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

In this paper, we propose RepVGGRNN, a lightweight sound event detection model. We use RepVGG convolution blocks in the convolution part to improve performance, and re-parameterize the RepVGG blocks after the model is trained to reduce the number of parameters in the convolution layers. To further improve the accuracy of the model, we train the lightweight model with both the mean teacher method and knowledge distillation. On the DCASE 2022 Task 4 validation dataset, the proposed system achieves PSDS (polyphonic sound event detection score) of 40.8% in scenario 1 and 67.7% in scenario 2, outperforming the baseline system's 34.4% and 57.2%. The proposed system has about 49.6K parameters, only 44.6% of the baseline system.

2013

Clausal-Packaging of Path of Motion in Second Language Acquisition of Russian and Spanish
Kawai Chui | Hsiang-lin Yeh | Wen-Chun Lan | Yu-Han Cheng
Proceedings of the 27th Pacific Asia Conference on Language, Information, and Computation (PACLIC 27)