Leveraging Latent Representations of Speech for Indian Language Identification

Samarjit Karmakar; P Radha Krishna

Abstract

Identifying the spoken language from speech utterances is an interesting task because of the diversity of languages and human voices. Indian languages have diverse origins, and identifying them from speech utterances would help several language recognition, translation and relationship mining tasks. Current approaches to language identification in the Indian context rely heavily on feature engineering and classical speech processing techniques. This is a bottleneck for language identification systems: the speech features needed for machine identification should be learnt by a probabilistic framework rather than handcrafted through feature engineering. In this paper, we tackle the problem of language identification using latent representations learnt from speech with Variational Autoencoders (VAEs), and we use the learnt representations to train sequence models. Our framework attains an accuracy of 89% in identifying eight well-known Indian languages (Tamil, Telugu, Punjabi, Marathi, Gujarati, Hindi, Kannada and Bengali) from the CMU Indic Speech Database. The presented approach can be applied to other speech processing scenarios by learning representations and feeding them to sequence models.
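The abstract does not give architectural details, so the following is only a minimal sketch of the general idea of turning per-frame speech features into a sequence of VAE latent vectors. The linear encoder, the feature and latent sizes, the random weights, and every function name here are illustrative assumptions, not the authors' model. It shows the standard reparameterization step and the closed-form KL term against a standard normal prior.

```python
import math
import random


def linear(frame, weights):
    """Apply a weight matrix (one row per output dimension) to one frame."""
    return [sum(w * x for w, x in zip(row, frame)) for row in weights]


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def encode_utterance(frames, w_mu, w_log_var, rng):
    """Map a list of feature frames to a list of sampled latent vectors."""
    latents, kls = [], []
    for frame in frames:
        mu = linear(frame, w_mu)
        log_var = linear(frame, w_log_var)
        latents.append(reparameterize(mu, log_var, rng))
        kls.append(kl_to_standard_normal(mu, log_var))
    return latents, kls


# Hypothetical sizes and random stand-in data (assumptions for illustration).
rng = random.Random(0)
n_features, latent_dim, n_frames = 40, 8, 100
frames = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
          for _ in range(n_frames)]
w_mu = [[0.1 * rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for _ in range(latent_dim)]
w_log_var = [[0.1 * rng.gauss(0.0, 1.0) for _ in range(n_features)]
             for _ in range(latent_dim)]

z_seq, kl = encode_utterance(frames, w_mu, w_log_var, rng)
print(len(z_seq), len(z_seq[0]), len(kl))
```

In a full system along these lines, the encoder would be trained jointly with a decoder on a reconstruction loss plus the KL term, and a sequence such as `z_seq` would then be passed to a sequence classifier that predicts the language; neither of those parts is shown here.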

Anthology ID: 2020.icon-main.45
Volume: Proceedings of the 17th International Conference on Natural Language Processing (ICON)
Month: December
Year: 2020
Address: Indian Institute of Technology Patna, Patna, India
Editors: Pushpak Bhattacharyya, Dipti Misra Sharma, Rajeev Sangal
Venue: ICON
Publisher: NLP Association of India (NLPAI)
Pages: 334–340
URL: https://aclanthology.org/2020.icon-main.45/
Cite (ACL): Samarjit Karmakar and P Radha Krishna. 2020. Leveraging Latent Representations of Speech for Indian Language Identification. In Proceedings of the 17th International Conference on Natural Language Processing (ICON), pages 334–340, Indian Institute of Technology Patna, Patna, India. NLP Association of India (NLPAI).
Cite (Informal): Leveraging Latent Representations of Speech for Indian Language Identification (Karmakar & Krishna, ICON 2020)
PDF: https://aclanthology.org/2020.icon-main.45.pdf
