Akshatha Anbalagan
2024
WordWizards@DravidianLangTech 2024:Fake News Detection in Dravidian Languages using Cross-lingual Sentence Embeddings
Akshatha Anbalagan
|
Priyadharshini T
|
Niranjana A
|
Shreedevi Balaji
|
Durairaj Thenmozhi
Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
The proliferation of fake news in digital media has become a significant societal concern, impacting public opinion, trust, and decision-making. This project focuses on the development of machine learning models for the detection of fake news. Leveraging a dataset containing both genuine and deceptive news articles, the proposed models employ natural language processing techniques, feature extraction and classification algorithms. This paper provides a solution to Fake News Detection in Dravidian Languages - DravidianLangTech 2024. There are two sub tasks: Task 1 - The goal of this task is to classify a given social media text into original or fake. We propose an approach for this with the help of a supervised machine learning model – SVM (Support Vector Machine). The SVM classifier achieved a macro F1 score of 0.78 in test data and a rank 11. The Task 2 is classifying fake news articles in Malayalam language into different categories namely False, Half True, Mostly False, Partly False and Mostly True.We have used Naive Bayes which achieved macro F1-score 0.3517 in test data and a rank 6.
WordWizards@DravidianLangTech 2024: Sentiment Analysis in Tamil and Tulu using Sentence Embedding
Shreedevi Balaji
|
Akshatha Anbalagan
|
Priyadharshini T
|
Niranjana A
|
Durairaj Thenmozhi
Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
Sentiment Analysis of Dravidian Languages has begun to garner attention recently as there is more need to analyze emotional responses and subjective opinions present in social media text. As this data is code-mixed and there are not many solutions to code-mixed text out there, we present to you a stellar solution to DravidianLangTech 2024: Sentiment Analysis in Tamil and Tulu task. To understand the sentiment of social media text, we used pre-trained transformer models and feature extraction vectorizers to classify the data with results that placed us 11th in the rankings for the Tamil task and 8th for the Tulu task with a accuracy F1 score of 0.12 and 0.30 which shows the efficiency of our approach.