Chirp Group Delay based Feature for Speech Applications

Malarvizhi Muthuramalingam, Anushiya Rachel Gladston, P Vijayalakshmi, T Nagarajan


Abstract
The conventional Fast Fourier Transform (FFT), computed on the unit circle, gives an accurate representation of the spectrum if the signal under consideration consists of sustained oscillations. Practical signals, however, are not sustained oscillations. For signals that either decay or grow over time, the phase spectrum computed using the conventional FFT is not accurate, and, in turn, neither is the magnitude spectrum. Hence, a feature based on a variant of the group delay spectrum, namely the chirp group delay (CGD) spectrum, is proposed. The efficacy of the proposed feature is evaluated in Gaussian Mixture Model (GMM)-based and Convolutional Neural Network (CNN)-based speaker identification systems. Analysis reveals a significant increase in performance when using the CGD-based feature over the magnitude spectrum.
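The abstract does not spell out how the CGD spectrum is computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the common formulation in which the z-transform is evaluated on a circle of radius r (rather than the unit circle), which is equivalent to taking the DFT of x[n] r^(-n), and it uses the standard unwrapping-free group delay identity tau(w) = Re{Y(w) / X(w)}, where Y is the transform of n x[n]. The function name `chirp_group_delay` and the parameters `radius` and `n_fft` are hypothetical.

```python
import numpy as np

def chirp_group_delay(x, radius=1.0, n_fft=512):
    """Group delay of x evaluated on the circle |z| = radius.

    Assumed formulation (not taken from the paper): radius = 1.0 gives
    the ordinary group delay on the unit circle.
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    # X(z) at z = radius * exp(jw) is the DFT of x[n] * radius**(-n).
    xw = x * float(radius) ** (-n)
    X = np.fft.rfft(xw, n_fft)
    # The transform of n * x[n] yields the phase derivative without unwrapping.
    Y = np.fft.rfft(n * xw, n_fft)
    # Guard against division by (near-)zero spectral magnitude.
    denom = np.maximum(np.abs(X) ** 2, 1e-12)
    return (X.real * Y.real + X.imag * Y.imag) / denom
```

As a sanity check, a unit impulse delayed by k samples has a group delay of k at every frequency, whichever radius is chosen. For a decaying signal, a radius below 1 moves the analysis circle closer to the signal's poles, which is the situation the abstract describes as poorly served by the unit-circle FFT.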
Anthology ID:
2024.icon-1.52
Volume:
Proceedings of the 21st International Conference on Natural Language Processing (ICON)
Month:
December
Year:
2024
Address:
AU-KBC Research Centre, Chennai, India
Editors:
Sobha Lalitha Devi, Karunesh Arora
Venue:
ICON
Publisher:
NLP Association of India (NLPAI)
Pages:
449–453
URL:
https://aclanthology.org/2024.icon-1.52/
Cite (ACL):
Malarvizhi Muthuramalingam, Anushiya Rachel Gladston, P Vijayalakshmi, and T Nagarajan. 2024. Chirp Group Delay based Feature for Speech Applications. In Proceedings of the 21st International Conference on Natural Language Processing (ICON), pages 449–453, AU-KBC Research Centre, Chennai, India. NLP Association of India (NLPAI).
Cite (Informal):
Chirp Group Delay based Feature for Speech Applications (Muthuramalingam et al., ICON 2024)
PDF:
https://aclanthology.org/2024.icon-1.52.pdf