Ishan Sanjeev Upadhyay

2022

Sammaan@LT-EDI-ACL2022: Ensembled Transformers Against Homophobia and Transphobia
Ishan Sanjeev Upadhyay | KV Aditya Srivatsa | Radhika Mamidi
Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion

Hateful and offensive content on social media platforms can have negative effects on users, make online communities more hostile towards certain people, and hamper equality, diversity and inclusion. In this paper, we describe our approach to classifying homophobia and transphobia in social media comments. We used an ensemble of transformer-based models to build our classifier. Our model ranked 2nd for English, 8th for Tamil and 10th for Tamil-English.

Towards Toxic Positivity Detection
Ishan Sanjeev Upadhyay | KV Aditya Srivatsa | Radhika Mamidi
Proceedings of the Tenth International Workshop on Natural Language Processing for Social Media

Over the past few years, there has been growing concern about toxic positivity on social media, a phenomenon in which positivity is used to minimize a person’s emotional experience. In this paper, we create a dataset for toxic positivity classification from Twitter and an inspirational quote website. We then perform benchmarking experiments using various text classification models and show the suitability of these models for the task. We achieved a macro F1 score of 0.71 and a weighted F1 score of 0.85 by using an ensemble model. To the best of our knowledge, this is the first such dataset.

2021

Hopeful Men@LT-EDI-EACL2021: Hope Speech Detection Using Indic Transliteration and Transformers
Ishan Sanjeev Upadhyay | Nikhil E | Anshul Wadhawan | Radhika Mamidi
Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion

This paper aims to describe the approach we used to detect hope speech in the HopeEDI dataset. We experimented with two approaches. In the first approach, we used contextual embeddings to train classifiers using logistic regression, random forest, SVM, and LSTM based models. The second approach involved using a majority voting ensemble of 11 models which were obtained by fine-tuning pre-trained transformer models (BERT, ALBERT, RoBERTa, IndicBERT) after adding an output layer. We found that the second approach was superior for English, Tamil and Malayalam. Our solution got a weighted F1 score of 0.93, 0.75 and 0.49 for English, Malayalam and Tamil respectively. Our solution ranked 1st in English, 8th in Malayalam and 11th in Tamil.

IIITH at SemEval-2021 Task 7: Leveraging transformer-based humourous and offensive text detection architectures using lexical and hurtlex features and task adaptive pretraining
Tathagata Raha | Ishan Sanjeev Upadhyay | Radhika Mamidi | Vasudeva Varma
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

This paper describes our approach (IIITH) for SemEval-2021 Task 7: HaHackathon: Detecting and Rating Humor and Offense. Our results focus on two major objectives: (i) the effect of task-adaptive pretraining on the performance of transformer-based models, and (ii) how lexical and HurtLex features help in quantifying humour and offense. In this paper, we provide a detailed description of our approach along with the comparisons mentioned above.
