2022
A Study on Using Different Audio Lengths in Transfer Learning for Improving Chainsaw Sound Recognition
Jia-Wei Chang | Zhong-Yun Hu
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)
Chainsaw sound recognition is a challenging task because of the complexity of the sound and the excessive noise in mountain environments. This study examines how different audio lengths influence the accuracy of the trained models. To that end, it uses LeNet, a simple model with few parameters, and adopts average pooling so that the proposed models can accept audio of any length. The performance comparison focuses on the influence of different audio lengths and further tests transfer learning from short-to-long and long-to-short audio. In the experiments, the models were trained on the ESC-10 dataset and validated on a self-collected chainsaw-audio dataset. The experimental results show that (a) the models trained with different audio lengths (1s, 3s, and 5s) reach accuracies of 74% to 78%, 74% to 77%, and 79% to 83%, respectively, on the self-collected dataset; (b) transfer learning significantly improves the generalization of these models, which achieve accuracies of 85.28%, 88.67%, and 91.8%; and (c) in transfer learning, a model trained from short to long audio achieves better results than one trained from long to short audio, with a difference of 14% in accuracy on 5s chainsaw audio.
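The abstract's point that average pooling lets a model accept audio of any length can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `global_average_pool`, the frame counts, and the four-channel feature maps are all hypothetical, chosen only to show that pooling over the time axis yields a fixed-size vector regardless of clip length.

```python
from typing import List


def global_average_pool(feature_map: List[List[float]]) -> List[float]:
    """Average each channel over the time axis.

    feature_map is a list of time frames, each a list of per-channel values.
    The result has one value per channel, regardless of the frame count.
    """
    if not feature_map:
        raise ValueError("feature_map must contain at least one frame")
    n_frames = len(feature_map)
    n_channels = len(feature_map[0])
    return [
        sum(frame[c] for frame in feature_map) / n_frames
        for c in range(n_channels)
    ]


# Hypothetical feature maps (4 channels each); frame counts are assumptions.
short_clip = [[1.0, 2.0, 3.0, 4.0]] * 10  # stands in for a short clip
long_clip = [[1.0, 2.0, 3.0, 4.0]] * 50   # stands in for a longer clip

pooled_short = global_average_pool(short_clip)
pooled_long = global_average_pool(long_clip)
```

Because both pooled vectors have the same length, a fixed-size classifier layer could follow the pooling step for clips of either duration.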
2021
以遷移學習改善深度神經網路模型於中文歌詞情緒辨識 (Using Transfer Learning to Improve Deep Neural Networks for Lyrics Emotion Recognition in Chinese)
Jia-Yi Liao | Ya-Hsuan Lin | Kuan-Cheng Lin | Jia-Wei Chang
International Journal of Computational Linguistics & Chinese Language Processing, Volume 26, Number 2, December 2021
A Study on Using Transfer Learning to Improve BERT Model for Emotional Classification of Chinese Lyrics
Jia-Yi Liao | Ya-Hsuan Lin | Kuan-Cheng Lin | Jia-Wei Chang
Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021)
The explosive growth of music libraries has made music information retrieval and recommendation a critical issue. Recommendation systems based on music emotion recognition are gradually gaining attention. Most studies build music emotion classification models from audio data rather than lyrics. In addition, because English language resources are abundant, most existing studies focus on English lyrics and rarely on Chinese lyrics. For this reason, we propose an approach that uses the pretrained BERT model and transfer learning to improve emotion classification of Chinese lyrics. Without any task-specific training for Chinese lyrics emotion classification, (a) BERT alone reaches only 50% classification accuracy, and (b) BERT with transfer learning from the CVAW, CVAP, and CVAT datasets achieves 71% classification accuracy.
2013
基於音段式LMR 對映之語音轉換方法的改進 (Improving of Segmental LMR-Mapping Based Voice Conversion Methods) [In Chinese]
Hung-Yan Gu | Jia-Wei Chang
Proceedings of the 25th Conference on Computational Linguistics and Speech Processing (ROCLING 2013)
基於音段式LMR對映之語音轉換方法的改進 (Improving of Segmental LMR-Mapping Based Voice Conversion Method) [In Chinese]
Hung-Yan Gu | Jia-Wei Chang
International Journal of Computational Linguistics & Chinese Language Processing, Volume 18, Number 4, December 2013-Special Issue on Selected Papers from ROCLING XXV
2012
以線性多變量迴歸來對映分段後音框之語音轉換方法 (A Voice Conversion Method Mapping Segmented Frames with Linear Multivariate Regression) [In Chinese]
Hung-Yan Gu | Jia-Wei Chang | Zan-Wei Wang
Proceedings of the 24th Conference on Computational Linguistics and Speech Processing (ROCLING 2012)