Balaji Radhakrishnan


2023

pdf bib
SRI-B’s Systems for IWSLT 2023 Dialectal and Low-resource Track: Marathi-Hindi Speech Translation
Balaji Radhakrishnan | Saurabh Agrawal | Raj Prakash Gohil | Kiran Praveen | Advait Vinay Dhopeshwarkar | Abhishek Pandey
Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)

This paper describes the speech translation systems SRI-B developed for the IWSLT 2023 Evaluation Campaign Dialectal and Low-resource track: Marathi-Hindi Speech Translation. We propose systems for both the constrained (systems are trained only on the datasets provided by the organizers) and the unconstrained conditions (systems can be trained with any resource). For both the conditions, we build end-to-end speech translation networks comprising of a conformer encoder and a transformer decoder. Under both the conditions, we leverage Marathi Automatic Speech Recognition (ASR) data to pre-train the encoder and subsequently train the entire model on the speech translation data. Our results demonstrate that pre-training the encoder with ASR data is a key step in significantly improving the speech translation performance. We also show that conformer encoders are inherently superior to its transformer counterparts for speech translation tasks. Our primary submissions achieved a BLEU% score of 31.2 on the constrained condition and 32.4 on the unconstrained condition. We secured the top position in the constrained condition and second position in the unconstrained condition.

2021

pdf bib
Classification of mental illnesses on social media using RoBERTa
Ankit Murarka | Balaji Radhakrishnan | Sushma Ravichandran
Proceedings of the 12th International Workshop on Health Text Mining and Information Analysis

Given the current social distancing regulations across the world, social media has become the primary mode of communication for most people. This has isolated millions suffering from mental illnesses who are unable to receive assistance in person. They have increasingly turned to online platforms to express themselves and to look for guidance in dealing with their illnesses. Keeping this in mind, we propose a solution to classify mental illness posts on social media thereby enabling users to seek appropriate help. In this work, we classify five prominent kinds of mental illnesses- depression, anxiety, bipolar disorder, ADHD and PTSD by analyzing unstructured user data on Reddit. In addition, we share a new high-quality dataset1 to drive research on this topic. The dataset consists of the title and post texts from 17159 posts and 13 subreddits each associated with one of the five mental illnesses listed above or a None class indicating the absence of any mental illness. Our model is trained on Reddit data but is easily extensible to other social media platforms as well as demonstrated in our results. We believe that our work is the first multi-class model that uses a Transformer based architecture such as RoBERTa to analyze people’s emotions and psychology. We also demonstrate how we stress test our model using behavioral testing. Our dataset is publicly available and we encourage researchers to utilize this to advance research in this arena. We hope that this work contributes to the public health system by automating some of the detection process and alerting relevant authorities about users that need immediate help.