Hemant Kumar Kathania


2021

Speaker Verification Experiments for Adults and Children Using Shared Embedding Spaces
Tuomas Kaseva | Hemant Kumar Kathania | Aku Rouhe | Mikko Kurimo
Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)

For children, the system trained on a large corpus of adult speakers performed worse than a system trained on a much smaller corpus of children’s speech. This is due to the acoustic mismatch between training and testing data. To capture more acoustic variability, we trained a shared system with mixed data from adults and children. The shared system yields the best EER for children with no degradation for adults. Thus, a single system trained with mixed data is applicable to speaker verification for both adults and children.

Spectral modification for recognition of children’s speech under mismatched conditions
Hemant Kumar Kathania | Sudarsana Reddy Kadiri | Paavo Alku | Mikko Kurimo
Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)

In this paper, we propose spectral modification by sharpening formants and reducing the spectral tilt, so that children’s speech can be recognized by automatic speech recognition (ASR) systems developed using adult speech. In this mismatched condition, ASR performance degrades because of the acoustic and linguistic mismatch between child and adult speakers. The proposed method is used to improve speech intelligibility and thereby enhance recognition of children’s speech with an acoustic model trained on adult speech. In the experiments, WSJCAM0 and PFSTAR are used as the databases for adults’ and children’s speech, respectively. The proposed technique gives a significant improvement in the context of DNN-HMM-based ASR. Furthermore, we validate the robustness of the technique by showing that it also performs well in mismatched noise conditions.