2024
Severity Classification and Dysarthric Speech Detection using Self-Supervised Representations
Sanjay B | Priyadharshini M.k | Vijayalakshmi P | Nagarajan T
Proceedings of the 21st International Conference on Natural Language Processing (ICON)
Automatic detection and classification of dysarthria severity from speech provides a non-invasive and efficient diagnostic tool, offering clinicians valuable insights to guide treatment and therapy decisions. Our study evaluated two pre-trained models, wav2vec2-BASE and distilALHuBERT, as feature extractors for building dysarthric speech detection and severity-level classification systems. We conducted experiments on the TDSC dataset using two approaches: a machine learning model (support vector machine, SVM) and a deep learning model (convolutional neural network, CNN). Our findings showed that features derived from distilALHuBERT significantly outperformed those from wav2vec2-BASE in both dysarthric speech detection and severity classification tasks. Notably, the distilALHuBERT features achieved 99% accuracy in automatic detection and 95% accuracy in severity classification, surpassing the performance of the wav2vec2-BASE features.
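As a rough illustration of the pipeline described in the abstract, and not the authors' implementation, the minimal sketch below extracts utterance-level embeddings from a wav2vec2-BASE checkpoint and trains an SVM classifier on them. The Hugging Face checkpoint name, the mean-pooling step, and the placeholder audio and severity labels are all assumptions; the distilALHuBERT features and the CNN classifier used in the paper are omitted.

# Illustrative sketch only: wav2vec2-BASE feature extraction + SVM classification.
# Assumes the "facebook/wav2vec2-base" checkpoint and 16 kHz mono audio arrays;
# the utterances and labels below are random placeholders, not the TDSC data.
import numpy as np
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

MODEL_NAME = "facebook/wav2vec2-base"
extractor = Wav2Vec2FeatureExtractor.from_pretrained(MODEL_NAME)
model = Wav2Vec2Model.from_pretrained(MODEL_NAME).eval()

def embed(waveform_16k: np.ndarray) -> np.ndarray:
    # Mean-pool the final hidden states into one 768-dim utterance vector.
    inputs = extractor(waveform_16k, sampling_rate=16000, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # shape: (1, frames, 768)
    return hidden.mean(dim=1).squeeze(0).numpy()

# Placeholder data: 8 one-second "utterances" with 4 hypothetical severity levels.
utterances = [np.random.randn(16000).astype(np.float32) for _ in range(8)]
labels = [0, 1, 2, 3, 0, 1, 2, 3]

X = np.stack([embed(w) for w in utterances])
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.25, random_state=0)

clf = SVC(kernel="rbf").fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))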