S. R. Mahadeva Prasanna

Also published as: S R Mahadeva Prasanna

2024

Evaluating the Efficacy of Large Acoustic Model for Documenting Non-Orthographic Tribal Languages in India
Tonmoy Rajkhowa | Amartya Roy Chowdhury | Hrishikesh Ravindra Karande | S. R. Mahadeva Prasanna
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Pre-trained Large Acoustic Models (LAMs), when fine-tuned, have largely been shown to improve performance on various spoken language technology tasks. However, they have mostly been evaluated on datasets containing English or other widely spoken languages, and their potential for novel under-resourced languages is not fully known. In this work, four novel under-resourced tribal languages without a standard writing system were introduced, and the application of such large pre-trained models to documenting these languages was assessed using Automatic Speech Recognition and Direct Speech-to-Text Translation systems. Transcriptions for these tribal languages were generated by adapting the scripts of languages with a prominent presence in the geographical regions where the tribal languages are spoken. The results suggest a viable direction for documenting these languages in the electronic domain using Spoken Language Technologies that incorporate LAMs. Additionally, the study helped in understanding the varying performance of the Large Acoustic Model across the four languages. It not only informs the adoption of appropriate scripts for transliterating spoken-only languages based on their language family but also aids in making informed decisions when analyzing the behavior of a particular Large Acoustic Model in different linguistic contexts.
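The abstract does not specify which Large Acoustic Model or toolkit was used; as an illustration only, the sketch below fine-tunes a multilingual wav2vec 2.0-style checkpoint for CTC-based ASR on speech transcribed in an adapted regional script. The checkpoint name, vocabulary file and dataset objects are assumptions, not details from the paper.

```python
# Minimal sketch (assumptions flagged): CTC fine-tuning of a multilingual
# wav2vec 2.0 checkpoint on a tribal language transcribed in an adapted script.
from transformers import (Wav2Vec2CTCTokenizer, Wav2Vec2FeatureExtractor,
                          Wav2Vec2Processor, Wav2Vec2ForCTC,
                          TrainingArguments, Trainer)

# Tokenizer built from the adapted-script character inventory (hypothetical file).
tokenizer = Wav2Vec2CTCTokenizer("adapted_script_vocab.json",
                                 unk_token="[UNK]", pad_token="[PAD]",
                                 word_delimiter_token="|")
feature_extractor = Wav2Vec2FeatureExtractor(feature_size=1, sampling_rate=16000,
                                             padding_value=0.0, do_normalize=True,
                                             return_attention_mask=True)
processor = Wav2Vec2Processor(feature_extractor=feature_extractor,
                              tokenizer=tokenizer)

model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/wav2vec2-xls-r-300m",          # assumed checkpoint, not the paper's
    vocab_size=len(tokenizer),
    pad_token_id=tokenizer.pad_token_id,
    ctc_loss_reduction="mean",
)
model.freeze_feature_encoder()               # keep low-level acoustic layers fixed

args = TrainingArguments(output_dir="lam-tribal-asr",
                         per_device_train_batch_size=8,
                         learning_rate=3e-4,
                         num_train_epochs=30)
# train_ds / dev_ds are assumed pre-processed speech/text pairs; the dynamic
# padding data collator usually used for CTC training is omitted for brevity.
trainer = Trainer(model=model, args=args,
                  train_dataset=train_ds,
                  eval_dataset=dev_ds,
                  tokenizer=processor.feature_extractor)
trainer.train()
```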

2022

Machine Translation for a Very Low-Resource Language - Layer Freezing Approach on Transfer Learning
Amartya Chowdhury | Deepak K. T. | Samudra Vijaya K | S. R. Mahadeva Prasanna
Proceedings of the Fifth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2022)

This paper presents the implementation of Machine Translation (MT) between Lambani, a low-resource Indian tribal language, and English, a high-resource universal language. Lambani is spoken by nomadic tribes of the Indian state of Karnataka, and it shares similarities with various other Indian languages. To implement the English-Lambani MT system, we followed a transfer learning approach with English-Kannada as the parent MT model. The implementation and performance of the English-Lambani MT system are discussed in this paper. Since Lambani has been influenced by various other languages, we explored the possibility of obtaining better MT performance by using parent models associated with related Indian languages. Specifically, we experimented with English-Gujarati and English-Marathi as additional parent models. We compare the performance of the three English-Lambani MT systems derived from these three parent language models and present the observations in the paper. Additionally, we explore the effect of freezing the encoder and decoder layers and the change in performance resulting from each.
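As a rough illustration of the layer-freezing transfer learning described above, the sketch below loads a parent sequence-to-sequence MT model, freezes its encoder (the decoder can be frozen analogously) and fine-tunes the remaining parameters on child English-Lambani data. The checkpoint name and dataset objects are placeholders; the paper does not state which toolkit or checkpoint was used.

```python
# Minimal sketch (assumptions flagged): freezing the encoder of a parent
# English-Kannada seq2seq model before fine-tuning on English-Lambani data.
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          Seq2SeqTrainingArguments, Seq2SeqTrainer)

parent_ckpt = "parent-en-kn-mt"              # hypothetical parent model name
tokenizer = AutoTokenizer.from_pretrained(parent_ckpt)
model = AutoModelForSeq2SeqLM.from_pretrained(parent_ckpt)

# Freeze the encoder; to freeze the decoder instead, use model.get_decoder().
for param in model.get_encoder().parameters():
    param.requires_grad = False

args = Seq2SeqTrainingArguments(output_dir="en-lambani-mt",
                                learning_rate=5e-5,
                                per_device_train_batch_size=16,
                                num_train_epochs=20,
                                predict_with_generate=True)
trainer = Seq2SeqTrainer(model=model, args=args,
                         train_dataset=train_pairs,   # assumed tokenised en-Lambani pairs
                         eval_dataset=dev_pairs,
                         tokenizer=tokenizer)
trainer.train()
```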

2020

Lexical Tone Recognition in Mizo using Acoustic-Prosodic Features
Parismita Gogoi | Abhishek Dey | Wendy Lalhminghlui | Priyankoo Sarmah | S R Mahadeva Prasanna
Proceedings of the Twelfth Language Resources and Evaluation Conference

Mizo is an under-studied Tibeto-Burman tonal language of North-East India. Preliminary research findings have confirmed that four distinct tones (High, Low, Rising and Falling) appear in the language. In this work, an attempt is made to automatically recognize the four phonological tones of Mizo using acoustic-prosodic parameters as features. Six features computed from Fundamental Frequency (F0) contours are considered, and two classifier models, one based on a Support Vector Machine (SVM) and one on a Deep Neural Network (DNN), are implemented for the automatic tone recognition task. The Mizo database consists of 31950 iterations of the four Mizo tones, collected from 19 speakers using trisyllabic phrases. A four-way classification of tones is attempted with a balanced dataset (an equal number of iterations per tone category). It is observed that the DNN-based classifier shows performance comparable to that of the SVM-based classifier in correctly recognizing the four phonological Mizo tones.
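To make the classification setup concrete, here is a hedged sketch of a four-way tone classifier over six F0-contour features, using scikit-learn's SVM and a small feed-forward network as stand-ins. The feature names, data and hyperparameters are illustrative only; the paper's exact feature set and DNN architecture are not reproduced here.

```python
# Minimal sketch (assumptions flagged): four-way tone classification from
# six F0-contour features with an SVM and a small feed-forward network.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Stand-in data: each row holds six F0-derived features (e.g. mean F0, slope,
# range); labels are the four tones. Real data would come from the Mizo corpus.
X = rng.normal(size=(1000, 6))
y = rng.integers(0, 4, size=1000)            # 0=High, 1=Low, 2=Rising, 3=Falling

svm_clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
dnn_clf = make_pipeline(StandardScaler(),
                        MLPClassifier(hidden_layer_sizes=(64, 32),
                                      max_iter=500, random_state=0))

print("SVM accuracy:", cross_val_score(svm_clf, X, y, cv=5).mean())
print("DNN accuracy:", cross_val_score(dnn_clf, X, y, cv=5).mean())
```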