Anita Saroj


2020

An Indian Language Social Media Collection for Hate and Offensive Speech
Anita Saroj | Sukomal Pal
Proceedings of the Workshop on Resources and Techniques for User and Author Profiling in Abusive Language

On social media, people express themselves every day on issues that affect their lives. During parliamentary elections, people's interactions with candidates in social media posts reflect many social trends in a charged atmosphere. People's likes and dislikes of leaders, political parties and their stands often become the subject of hate and offensive posts. We collected social media posts in Hindi and English from Facebook and Twitter during the run-up to the 2019 parliamentary election of India (PEI data-2019). We created a dataset for sentiment analysis with three categories: hate speech, offensive and not hate, or not offensive. We report here the initial results of sentiment classification on the dataset using different classifiers.

IRlab@IITV at SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media Using SVM
Anita Saroj | Supriya Chanda | Sukomal Pal
Proceedings of the Fourteenth Workshop on Semantic Evaluation

This paper describes the IRlab@IIT-BHU system for OffensEval 2020. We use an SVM with TF-IDF features to identify and categorize hate speech and offensive language in social media for two languages. In subtask A, we used a linear SVM classifier to detect abusive content in tweets, achieving macro F1 scores of 0.779 and 0.718 for Arabic and Greek, respectively.
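
The macro F1 score cited in this abstract is the unweighted mean of the per-class F1 scores, so a minority class counts as much as a majority class. The following is a minimal sketch of that metric in plain Python. It is not the authors' evaluation code; the function name `macro_f1` and the toy labels are assumptions made for illustration only.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels (hypothetical, not taken from the shared task data)
y_true = ["OFF", "NOT", "NOT", "OFF", "NOT", "NOT"]
y_pred = ["OFF", "NOT", "OFF", "NOT", "NOT", "NOT"]
score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.3f}")
```

In this toy case the per-class F1 scores are 0.5 and 0.75, and the macro average gives them equal weight regardless of how many examples each class has.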