Ankit Vaidya


2024

pdf bib
PICT at StanceEval2024: Stance Detection in Arabic using Ensemble of Large Language Models
Ishaan Shukla | Ankit Vaidya | Geetanjali Kale
Proceedings of The Second Arabic Natural Language Processing Conference

This paper outlines our approach to the StanceEval 2024- Arabic Stance Evaluation shared task. The goal of the task was to identify the stance, one out of three (Favor, Against or None) towards tweets based on three topics, namely- COVID-19 Vaccine, Digital Transformation and Women Empowerment. Our approach consists of fine-tuning BERT-based models efficiently for both, Single-Task Learning as well as Multi-Task Learning, the details of which are discussed. Finally, an ensemble was implemented on the best-performing models to maximize overall performance. We achieved a macro F1 score of 78.02% in this shared task. Our codebase is available publicly.

pdf bib
CLTeam1 at SemEval-2024 Task 10: Large Language Model based ensemble for Emotion Detection in Hinglish
Ankit Vaidya | Aditya Gokhale | Arnav Desai | Ishaan Shukla | Sheetal Sonawane
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper outlines our approach for the ERC subtask of the SemEval 2024 EdiREF Shared Task. In this sub-task, an emotion had to be assigned to an utterance which was the part of a dialogue. The utterance had to be classified into one of the following classes- disgust, contempt, anger, neutral, joy, sadness, fear, surprise. Our proposed system makes use of an ensemble of language specific RoBERTA and BERT models to tackle the problem. A weighted F1-score of 44% was achieved by our system in this task. We conducted comprehensive ablations and suggested directions of future work. Our codebase is available publicly.

2023

pdf bib
Two-stage Pipeline for Multilingual Dialect Detection
Ankit Vaidya | Aditya Kane
Tenth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2023)

Dialect Identification is a crucial task for localizing various Large Language Models. This paper outlines our approach to the VarDial 2023 shared task. Here we have to identify three or two dialects from three languages each which results in a 9-way classification for Track-1 and 6-way classification for Track-2 respectively. Our proposed approach consists of a two-stage system and outperforms other participants’ systems and previous works in this domain. We achieve a score of 58.54% for Track-1 and 85.61% for Track-2. Our codebase is available publicly (https://github.com/ankit-vaidya19/EACL_VarDial2023).