RCRNN-based Sound Event Detection System with Specific Speech Resolution

Sung-Jen Huang, Yih-Wen Wang, Chia-Ping Chen, Chung-Li Lu, Bo-Cheng Chan


Abstract
A sound event detection (SED) system outputs the sound events in an audio signal together with their time boundaries. We propose an RCRNN-based SED system with residual connections and a convolutional block attention mechanism, built on the mean-teacher framework for semi-supervised learning. The neural network can be trained with weakly labeled and unlabeled data. In addition, because we consider the speech event to carry more information than other sound events, we use a speech-specific time-frequency resolution to extract the acoustic features of the speech event. Furthermore, we apply data augmentation and post-processing to improve performance. On the DCASE 2021 Task 4 validation set, the proposed system achieves a PSDS (Polyphonic Sound Event Detection Score) scenario 2 of 57.6% and an event-based F1-score of 41.6%, outperforming the baseline scores of 52.7% and 40.7%, respectively.
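The abstract names the mean-teacher framework but gives no implementation details. As a minimal sketch, assuming the standard mean-teacher formulation (teacher weights kept as an exponential moving average of the student weights, plus a consistency loss between the two models' predictions), the core update could look like the following. The names `ema_update`, `consistency_loss`, and `alpha`, and the value 0.999, are illustrative assumptions and are not taken from the paper.

```python
from typing import Dict, List, Sequence

# Hypothetical weight container: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def ema_update(teacher: Weights, student: Weights, alpha: float = 0.999) -> None:
    """Update teacher weights in place as an exponential moving average
    of the student weights: teacher = alpha * teacher + (1 - alpha) * student."""
    for name, student_values in student.items():
        teacher_values = teacher[name]
        for i, s in enumerate(student_values):
            teacher_values[i] = alpha * teacher_values[i] + (1.0 - alpha) * s


def consistency_loss(student_probs: Sequence[float],
                     teacher_probs: Sequence[float]) -> float:
    """Mean squared error between student and teacher predictions.
    It needs no labels, so it can be computed on unlabeled clips."""
    if len(student_probs) != len(teacher_probs):
        raise ValueError("prediction lengths differ")
    squared = [(s - t) ** 2 for s, t in zip(student_probs, teacher_probs)]
    return sum(squared) / len(squared)
```

In this sketch only the student would be trained by gradient descent; the teacher is updated after each training step by calling `ema_update`, and the consistency term is added to the supervised loss computed on the labeled data.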
Anthology ID:
2021.rocling-1.16
Volume:
Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021)
Month:
October
Year:
2021
Address:
Taoyuan, Taiwan
Editors:
Lung-Hao Lee, Chia-Hui Chang, Kuan-Yu Chen
Venue:
ROCLING
Publisher:
The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages:
118–123
URL:
https://aclanthology.org/2021.rocling-1.16
Cite (ACL):
Sung-Jen Huang, Yih-Wen Wang, Chia-Ping Chen, Chung-Li Lu, and Bo-Cheng Chan. 2021. RCRNN-based Sound Event Detection System with Specific Speech Resolution. In Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021), pages 118–123, Taoyuan, Taiwan. The Association for Computational Linguistics and Chinese Language Processing (ACLCLP).
Cite (Informal):
RCRNN-based Sound Event Detection System with Specific Speech Resolution (Huang et al., ROCLING 2021)
PDF:
https://aclanthology.org/2021.rocling-1.16.pdf