Rajalakshmi Sivanaiah


2023

pdf bib
Athena@DravidianLangTech: Abusive Comment Detection in Code-Mixed Languages using Machine Learning Techniques
Hema M | Anza Prem | Rajalakshmi Sivanaiah | Angel Deborah S
Proceedings of the Third Workshop on Speech and Language Technologies for Dravidian Languages

The amount of digital material disseminated through social media platforms has increased significantly in recent years. Online networks have gained popularity and established themselves as go-to resources for news, information, and entertainment. Nevertheless, despite the many advantages of online networks, mounting evidence indicates that an increasing number of malicious actors are exploiting these networks to spread toxic content and harm other people. This work aims to detect abusive content in YouTube comments written in Tamil, code-mixed Tamil-English, and code-mixed Telugu-English. The work was undertaken as part of the “DravidianLangTech@RANLP 2023” shared task. The macro F1 scores for the Tamil, Tamil-English, and Telugu-English datasets were 0.28, 0.37, and 0.6137, securing 5th, 7th, and 8th rank respectively.

pdf bib
Avalanche at DravidianLangTech: Abusive Comment Detection in Code Mixed Data Using Machine Learning Techniques with Under Sampling
Rajalakshmi Sivanaiah | Rajasekar S | Srilakshmisai K | Angel Deborah S | Mirnalinee ThankaNadar
Proceedings of the Third Workshop on Speech and Language Technologies for Dravidian Languages

In recent years, the growth of online platforms and social media has given rise to a concerning increase in abusive content, which poses significant challenges to maintaining a safe and inclusive digital environment. To address this issue, this paper presents an approach for detecting abusive comments. We use a combination of pipelining and vectorization techniques, along with algorithms such as the stochastic gradient descent (SGD) classifier and the support vector machine (SVM) classifier. We conducted experiments on a Tamil-English code-mixed dataset to evaluate the performance of this approach. Using the SGD classifier, we achieved a weighted F1 score of 0.76 and a macro F1 score of 0.45 on the development dataset. Using the SVM classifier, we obtained a weighted F1 score of 0.78 and a macro F1 score of 0.42 on the development dataset. On the test dataset, the SGD approach secured 5th rank with a macro F1 score of 0.44, while the SVM approach secured 8th rank with a macro F1 score of 0.35 in the shared task. The top-ranked team achieved a macro F1 score of 0.55.
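
For illustration only, the following minimal Python sketch shows the kind of TF-IDF pipeline with SGD and SVM classifiers the abstract describes; the comments, labels and hyperparameters are placeholders rather than the authors' configuration, and the under-sampling step from the title is only noted in a comment.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.svm import LinearSVC
from sklearn.pipeline import Pipeline

# Placeholder code-mixed comments and labels; the shared-task data is not reproduced here.
train_texts = ["intha video romba nalla iruku", "unnoda comment romba mosamana pechu"]
train_labels = ["Not-Abusive", "Abusive"]

# The paper additionally under-samples the majority class before training (not shown here).
sgd_clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True)),
    ("clf", SGDClassifier(loss="hinge", class_weight="balanced", random_state=42)),
])
svm_clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("clf", LinearSVC()),
])

sgd_clf.fit(train_texts, train_labels)
svm_clf.fit(train_texts, train_labels)
print(sgd_clf.predict(["indha maari pesaadheenga"]))  # -> predicted label
```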

pdf bib
TechSSN at SemEval-2023 Task 12: Monolingual Sentiment Classification in Hausa Tweets
Nishaanth Ramanathan | Rajalakshmi Sivanaiah | Angel Deborah S | Mirnalinee Thanka Nadar Thanagathai
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper elaborates on our work in designing a system for SemEval-2023 Task 12: AfriSenti-SemEval, which involves sentiment analysis for low-resource African languages using a Twitter dataset. We utilised a pre-trained model to perform sentiment classification on Hausa-language tweets. We used a multilingual version of the RoBERTa model, pretrained on 100 languages, to classify sentiments in Hausa. To tokenize the text, we used the AfriBERTa model, which is pretrained specifically on African languages.
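
As a rough sketch only, the snippet below loads a multilingual RoBERTa checkpoint with a three-class sentiment head and scores a tweet; the xlm-roberta-base checkpoint and the toy input are assumptions, and the AfriBERTa tokenizer mentioned in the abstract is not reproduced here.

```python
# A minimal sketch, not the authors' code: a multilingual RoBERTa checkpoint
# with a 3-way sentiment head (positive / neutral / negative) for Hausa tweets.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=3)

batch = tokenizer(["Ina son wannan waka"],   # placeholder Hausa tweet
                  padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**batch).logits
print(logits.argmax(dim=-1))                 # predicted sentiment class id
```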

pdf bib
TechSSN1 at LT-EDI-2023: Depression Detection and Classification using BERT Model for Social Media Texts
Venkatasai Ojus Yenumulapalli | Vijai Aravindh R | Rajalakshmi Sivanaiah | Angel Deborah S
Proceedings of the Third Workshop on Language Technology for Equality, Diversity and Inclusion

Depression is a severe mental health disorder characterized by persistent feelings of sadness and anxiety and a decline in cognitive functioning, resulting in drastic changes to a person’s psychological and physical well-being. However, depression is completely curable when treated at a suitable time, allowing an individual to recover. The objective of this paper is to devise a technique for detecting signs of depression in English social media comments and classifying them by intensity into severe, moderate, and not depressed categories. The paper illustrates three approaches developed while working on the problem. Of these, the BERT model proved to be the most suitable, with a macro F1 score of 0.407, which gave us the 11th rank overall.
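
A minimal sketch of the kind of three-label BERT fine-tuning the abstract describes follows; the bert-base-uncased checkpoint, toy data and hyperparameters are assumptions, not the authors' setup.

```python
# A minimal sketch of fine-tuning BERT for the three severity labels.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

labels = {"not depressed": 0, "moderate": 1, "severe": 2}
texts = ["I feel fine today", "Nothing matters anymore"]        # placeholder posts
y = torch.tensor([labels["not depressed"], labels["severe"]])

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(labels))
optim = torch.optim.AdamW(model.parameters(), lr=2e-5)

enc = tok(texts, padding=True, truncation=True, return_tensors="pt")
model.train()
for _ in range(3):                        # a few toy epochs
    out = model(**enc, labels=y)          # cross-entropy loss is built in
    out.loss.backward()
    optim.step()
    optim.zero_grad()
```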

pdf bib
SSNTech2@LT-EDI-2023: Homophobia/Transphobia Detection in Social Media Comments Using Linear Classification Techniques
Vaidhegi D | Priya M | Rajalakshmi Sivanaiah | Angel Deborah S | Mirnalinee ThankaNadar
Proceedings of the Third Workshop on Language Technology for Equality, Diversity and Inclusion

Abusive content on social media networks has a destructive effect on the mental well-being of online users. Homophobia refers to fear, negative attitudes and feelings towards homosexuality, while transphobia refers to negative attitudes, hatred and prejudice towards transsexual people. Even though some parts of society have started to accept homosexuality and transsexuality, a large section of the population still opposes them. Hate speech targeting LGBTQ+ individuals, known as homophobic/transphobic speech, has become a growing concern and has created a toxic and unwelcoming environment for LGBTQ+ people on online platforms. This poses a significant societal issue, hindering the progress of equality, diversity, and inclusion. Identifying homophobic and transphobic comments on social media platforms plays a crucial role in creating a safer environment for all users. To accomplish this, we built machine learning models using SGD and SVM classifiers. Our approach yielded promising results, with a weighted F1-score of 0.95 on the English dataset, securing 4th rank in this task.

pdf bib
TechSSN4@LT-EDI-2023: Depression Sign Detection in Social Media Postings using DistilBERT Model
Krupa Elizabeth Thannickal | Sanmati P | Rajalakshmi Sivanaiah | Angel Deborah S
Proceedings of the Third Workshop on Language Technology for Equality, Diversity and Inclusion

As the world population increases, more people are living to the age at which depression or Major Depressive Disorder (MDD) commonly occurs, and consequently the number of people who suffer from such disorders is rising. There is a pressing need for faster and more reliable diagnosis methods. This paper proposes a method to analyse text from subjects’ social media posts to determine the severity class of depression. We used the DistilBERT transformer to process these texts and classify individuals across three severity labels - ‘not depression’, ‘moderate’ and ‘severe’. The results showed a macro F1-score of 0.437 when the model was trained for 5 epochs, with comparable performance across the labels. The team secured 6th rank, while the top team scored a macro F1-score of 0.470. We hope that this system will support further research into the early identification of depression in individuals, promoting effective medical research and related treatments.
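
A rough sketch of fine-tuning DistilBERT for the three severity labels with the Hugging Face Trainer is shown below; the checkpoint, toy dataset and batch size are assumptions, with only the 5 training epochs taken from the abstract.

```python
# A rough sketch, not the paper's code: DistilBERT fine-tuning via Trainer.
import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

class PostDataset(torch.utils.data.Dataset):
    """Wraps tokenised social-media posts and integer severity labels."""
    def __init__(self, texts, labels, tok):
        self.enc = tok(texts, padding=True, truncation=True)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=3)   # not depression / moderate / severe

train_ds = PostDataset(["placeholder post"], [0], tok)   # toy data
args = TrainingArguments(output_dir="out", num_train_epochs=5,
                         per_device_train_batch_size=16)
Trainer(model=model, args=args, train_dataset=train_ds).train()
```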

pdf bib
The Mavericks@LT-EDI-2023: Detection of signs of Depression from social Media Texts using Naive Bayes approach
Sathvika V S | Vaishnavi Vaishnavi S | Angel Deborah S | Rajalakshmi Sivanaiah | Mirnalinee ThankaNadar
Proceedings of the Third Workshop on Language Technology for Equality, Diversity and Inclusion

Social media platforms have revolutionized the landscape of communication, providing individuals with an outlet to express their thoughts, emotions, and experiences openly. This paper focuses on the development of a model to determine whether individuals exhibit signs of depression based on their social media texts. With the aim of optimizing performance and accuracy, a Naive Bayes approach was chosen for the detection task. The Naive Bayes algorithm, a probabilistic classifier, was applied to extract features and classify the texts. The model leveraged linguistic patterns, sentiment analysis, and other relevant features to capture indicators of depression within the texts. Preprocessing techniques, including tokenization, stemming, and stop-word removal, were employed to enhance the quality of the input data. The performance of the Naive Bayes model was evaluated using standard metrics such as accuracy, precision, recall, and F1-score; it achieved a macro-averaged F1 score of 0.263.
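
The following minimal sketch illustrates a Naive Bayes pipeline with the preprocessing steps the abstract lists (tokenization, stemming, stop-word removal); the toy texts and labels are placeholders, not the shared-task data.

```python
# A minimal sketch of the Naive Bayes pipeline described above.
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

nltk.download("stopwords", quiet=True)
stop, stem = set(stopwords.words("english")), PorterStemmer()

def preprocess(text):
    # lower-case, tokenise, drop stop words, stem
    tokens = re.findall(r"[a-z']+", text.lower())
    return " ".join(stem.stem(t) for t in tokens if t not in stop)

texts = ["I cannot sleep and nothing feels right", "Had a lovely day out"]
labels = [1, 0]                      # 1 = signs of depression (placeholder)

vec = CountVectorizer()
X = vec.fit_transform(preprocess(t) for t in texts)
clf = MultinomialNB().fit(X, labels)
print(clf.predict(vec.transform([preprocess("everything feels hopeless")])))
```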

2022

pdf bib
TechSSN at SemEval-2022 Task 5: Multimedia Automatic Misogyny Identification using Deep Learning Models
Rajalakshmi Sivanaiah | Angel S | Sakaya Milton Rajendram | Mirnalinee T T
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

Research is progressing rapidly on offensive, hateful, abusive and sarcastic content. Tackling hate speech against women is urgent and necessary to ensure they are treated with respect. This paper describes the system used for identifying misogynous content from images and text. The system developed by team TECHSSN uses transformer models to detect misogynous content in text and a Convolutional Neural Network (CNN) model for image data. Various models such as BERT, ALBERT, XLNET and CNN were explored, and the combination of ALBERT and CNN as an ensemble model provided better results than the rest. This system was developed for Task 5 of SemEval 2022.
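
A schematic sketch of a simple late-fusion ensemble in the spirit of the ALBERT-plus-CNN combination described above follows; the albert-base-v2 checkpoint, the toy CNN and the averaging rule are assumptions, not the authors' exact architecture.

```python
# A schematic late-fusion sketch: ALBERT scores the meme text, a small CNN
# scores the meme image, and the two probability vectors are averaged.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("albert-base-v2")
text_model = AutoModelForSequenceClassification.from_pretrained(
    "albert-base-v2", num_labels=2)           # misogynous / not misogynous

image_model = nn.Sequential(                  # toy CNN over 3x224x224 images
    nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 2),
)

text = ["overlaid meme caption goes here"]    # placeholder transcribed text
image = torch.rand(1, 3, 224, 224)            # placeholder image tensor

with torch.no_grad():
    p_text = text_model(**tok(text, return_tensors="pt")).logits.softmax(-1)
    p_img = image_model(image).softmax(-1)
pred = ((p_text + p_img) / 2).argmax(-1)      # simple late-fusion ensemble
```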

pdf bib
TechSSN at SemEval-2022 Task 6: Intended Sarcasm Detection using Transformer Models
Ramdhanush V | Rajalakshmi Sivanaiah | Angel S | Sakaya Milton Rajendram | Mirnalinee T T
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

Irony detection in social media is an emerging research area that plays a major role in sentiment analysis and offensive language identification. Sarcasm is a form of irony used to make remarks whose intended meaning differs from their literal meaning. This paper describes a method to detect intended sarcasm in text (SemEval-2022 Task 6). The TECHSSN team used Bidirectional Encoder Representations from Transformers (BERT) models and their variants to classify text as sarcastic or non-sarcastic in English and Arabic. The data is preprocessed and fed to the model for training. The transformer models learn their weights from the given dataset during the training phase and predict the output class labels for the unseen test data.

pdf bib
SSN_MLRG1 at SemEval-2022 Task 10: Structured Sentiment Analysis using 2-layer BiLSTM
Karun Anantharaman | Divyasri K | Jayannthan Pt | Angel S | Rajalakshmi Sivanaiah | Sakaya Milton Rajendram | Mirnalinee T T
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

Task 10 in SemEval 2022 is a composite task which entails the analysis of opinion tuples and the recognition and demarcation of their nature. In this paper, we elaborate on how our methodology is implemented for structured sentiment analysis and on the results obtained. To achieve this objective, we adopted a two-layer BiLSTM approach. To improve accuracy, we depart from the norm by basing the label assigned to an individual token on its adjacent tokens, using specialized algorithms to ensure that the output is the most accurate label sequence as a whole. This strategy is superior in terms of parsing accuracy and requires less time. It yielded an SF1 of 0.33 in the highest-performing configuration.
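
A minimal sketch of a two-layer BiLSTM sequence labeller of the kind described above is shown below; vocabulary size, tag set and hidden sizes are placeholders, not the authors' configuration.

```python
# A minimal two-layer BiLSTM tagger: one label score vector per token.
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, emb=100, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb)
        self.lstm = nn.LSTM(emb, hidden, num_layers=2,
                            bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, num_tags)   # 2x for both directions

    def forward(self, token_ids):
        h, _ = self.lstm(self.emb(token_ids))
        return self.out(h)                  # shape: (batch, seq_len, num_tags)

model = BiLSTMTagger(vocab_size=5000, num_tags=7)   # e.g. BIO tags for opinion spans
tokens = torch.randint(0, 5000, (1, 12))            # one 12-token sentence (toy input)
tag_scores = model(tokens)
print(tag_scores.argmax(-1))                        # predicted tag per token
```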

pdf bib
SSN_MLRG1@DravidianLangTech-ACL2022: Troll Meme Classification in Tamil using Transformer Models
Shruthi Hariprasad | Sarika Esackimuthu | Saritha Madhavan | Rajalakshmi Sivanaiah | Angel S
Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages

The ACL shared task of DravidianLangTech-2022 for Troll Meme classification is a binary classification task that involves identifying Tamil memes as troll or not-troll. Classification of memes is a challenging task since memes express humour and sarcasm in an implicit way. Team SSN_MLRG1 tested and compared results obtained by using three models namely BERT, ALBERT and XLNET. The XLNet model outperformed the other two models in terms of various performance metrics. The proposed XLNet model obtained the 3rd rank in the shared task with a weighted F1-score of 0.558.

pdf bib
Varsini_and_Kirthanna@DravidianLangTech-ACL2022-Emotional Analysis in Tamil
Varsini S | Kirthanna Rajan | Angel S | Rajalakshmi Sivanaiah | Sakaya Milton Rajendram | Mirnalinee T T
Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages

In this paper, we present our system for the task of emotion analysis in Tamil. Over 3.96 million people use social media platforms to send messages formed from text, images, videos, audio or combinations of these to express their thoughts and feelings. Text communication on social media platforms is quite overwhelming due to its enormous quantity and simplicity, and the data must be processed to understand the general feeling expressed by the author. We present a lexicon-based approach for extracting emotion from Tamil texts, using dictionaries of words labelled with their respective emotions. The process assigns an emotional label to each text and then captures the main emotion expressed in it. Finally, our method achieves an F1-score of 0.0300 on the official test set and ranks 5th.
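
A minimal sketch of the lexicon-based idea described above follows; the tiny emotion dictionary and example sentence are illustrative only, not the authors' lexicon.

```python
# A minimal lexicon-based emotion labeller: count lexicon hits per emotion
# and return the most frequent one.
from collections import Counter

emotion_lexicon = {          # Tamil word -> emotion label (toy entries)
    "மகிழ்ச்சி": "joy",
    "கோபம்": "anger",
    "சோகம்": "sadness",
}

def main_emotion(text, default="neutral"):
    counts = Counter(emotion_lexicon[w] for w in text.split()
                     if w in emotion_lexicon)
    return counts.most_common(1)[0][0] if counts else default

print(main_emotion("இந்த பாடல் மகிழ்ச்சி தருகிறது"))   # -> joy
```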

pdf bib
SSN_ARMM@LT-EDI-ACL2022: Hope Speech Detection for Equality, Diversity, and Inclusion Using ALBERT model
Praveenkumar Vijayakumar | Prathyush S | Aravind P | Angel S | Rajalakshmi Sivanaiah | Sakaya Milton Rajendram | Mirnalinee T T
Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion

In recent years, social media has become one of the major forums for expressing human views and emotions. With the help of smartphones and high-speed internet, anyone can express their views on social media. However, this can also lead to the spread of hatred and violence in society, so it is necessary to build methods to find and support helpful social media content. In this paper, we studied a Natural Language Processing approach for detecting hope speech in a given sentence. The task was to classify sentences into ‘Hope speech’ and ‘Non-hope speech’. The dataset, consisting of YouTube comments, was provided by the LT-EDI organizers. Based on the task description, we developed a system using the pre-trained language model BERT to complete this task. Our model achieved 1st rank in Kannada with a weighted average F1 score of 0.750, 2nd rank in Malayalam with a weighted average F1 score of 0.740, 3rd rank in Tamil with a weighted average F1 score of 0.390 and 6th rank in English with a weighted average F1 score of 0.880.

pdf bib
SSN_MLRG3@LT-EDI-ACL2022: Depression Detection System from Social Media Text using Transformer Models
Sarika Esackimuthu | Shruthi Hariprasad | Rajalakshmi Sivanaiah | Angel S | Sakaya Milton Rajendram | Mirnalinee T T
Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion

Depression is a common mental illness that involves sadness and a lack of interest in day-to-day activities. The task is to classify social media text into three labels indicating signs of depression, namely “not depressed”, “moderately depressed”, and “severely depressed”. We have built a system using deep learning transformer models; the Transformers library provides thousands of pretrained models for tasks on different modalities such as text, vision, and audio. The multi-class classification model used in our system is based on the ALBERT model. In the ACL 2022 shared task, our team SSN_MLRG3 obtained a macro F1 score of 0.473.

pdf bib
SSN_MLRG1@LT-EDI-ACL2022: Multi-Class Classification using BERT models for Detecting Depression Signs from Social Media Text
Karun Anantharaman | Angel S | Rajalakshmi Sivanaiah | Saritha Madhavan | Sakaya Milton Rajendram
Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion

DepSign-LT-EDI@ACL-2022 aims to ascertain the signs of depression of a person from their messages and posts on social media, wherein people share their feelings and emotions. Given social media postings in English, the system should classify the signs of depression into three labels, namely “not depressed”, “moderately depressed”, and “severely depressed”. To achieve this objective, we have adopted a fine-tuned BERT model. This solution from team SSN_MLRG1 achieves 58.5% accuracy on the DepSign-LT-EDI@ACL-2022 test set.

2021

pdf bib
TECHSSN at SemEval-2021 Task 7: Humor and Offense detection and classification using ColBERT embeddings
Rajalakshmi Sivanaiah | Angel Deborah S | S Milton Rajendram | Mirnalinee Tt | Abrit Pal Singh | Aviansh Gupta | Ayush Nanda
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

This paper describes the system used for detecting humor in text. The system developed by team TECHSSN uses binary classification techniques to classify the text. The data undergoes preprocessing and is given to ColBERT (Contextualized Late Interaction over BERT), a modification of Bidirectional Encoder Representations from Transformers (BERT). The model is re-trained and the weights are learned for the dataset. This system was developed for Task 7 of SemEval 2021.

2020

pdf bib
TECHSSN at SemEval-2020 Task 12: Offensive Language Detection Using BERT Embeddings
Rajalakshmi Sivanaiah | Angel Suseelan | S Milton Rajendram | Mirnalinee T.t.
Proceedings of the Fourteenth Workshop on Semantic Evaluation

This paper describes the work of identifying the presence of offensive language in social media posts and categorizing a post as targeted at a particular person or not. The work developed by team TECHSSN for Multilingual Offensive Language Identification in Social Media (Task 12) at SemEval-2020 involves deep learning models with BERT embeddings. The dataset is preprocessed and given to a Bidirectional Encoder Representations from Transformers (BERT) model with pretrained weight vectors. The model is retrained and the weights are learned for the offensive language dataset. We developed a system for the English-language dataset. The results are better compared to the model we developed for SemEval-2019 Task 6.