Gerry Dozier
2023
AU_NLP at SemEval-2023 Task 10: Explainable Detection of Online Sexism Using Fine-tuned RoBERTa
Amit Das | Nilanjana Raychawdhary | Tathagata Bhattacharya | Gerry Dozier | Cheryl D. Seals
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Social media was developed to connect people and make the world smaller, but it has recently become a hub for sexist memes, particularly ones that target women. As a result, hostile actions and harassing remarks are increasingly common online. In this paper, we introduce our system for online sexism detection, part of SemEval-2023 Task 10. We fine-tune a RoBERTa model to address this problem. The experimental results reported in this paper demonstrate the effectiveness of the proposed approach.
Seals_Lab at SemEval-2023 Task 12: Sentiment Analysis for Low-resource African Languages, Hausa and Igbo
Nilanjana Raychawdhary | Amit Das | Gerry Dozier | Cheryl D. Seals
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Sentiment analysis is one of the most extensively researched applications in natural language processing (NLP). While the majority of studies focus on high-resource languages (e.g., English), this research focuses on two low-resource African languages, Igbo and Hausa. The annotated tweets in both languages include a significant number of code-mixed tweets. The curated datasets necessary to build complex AI applications are not available for the majority of African languages. To optimize the use of such datasets, research is needed to determine the viability of present NLP procedures and to develop novel techniques. This paper outlines our efforts to develop a sentiment analysis system (positive, negative, and neutral) for tweets in the Hausa and Igbo languages. Sentiment analysis computationally identifies the sentiments expressed in a text or document. We worked with the AfriSenti-SemEval 2023 Shared Task 12 Twitter datasets, the first thorough compilation of human-annotated tweets for the most widely spoken languages in Nigeria, such as Hausa and Igbo. We trained the pre-trained language model AfriBERTa large on the AfriSenti-SemEval Shared Task 12 Twitter dataset for sentiment classification. Our model achieved an F1 score of 80.85% for Hausa and 80.82% for Igbo on the sentiment analysis test. In AfriSenti-SemEval 2023 Shared Task 12 (Task A), we consistently ranked in the top 10, achieving a mean F1 score of more than 80% for both the Hausa and Igbo languages.
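The F1 scores reported above summarize precision and recall over the three sentiment classes. The sketch below shows one way such a score could be computed from gold and predicted labels. It is illustrative only: the abstract does not state which averaging scheme was used, so macro averaging is an assumption here, and the label names, function names, and toy data are hypothetical rather than taken from the shared task.

```python
# Assumed label set; the abstract names positive, negative, and neutral classes.
LABELS = ("positive", "neutral", "negative")


def per_class_f1(gold, pred, label):
    """F1 for a single class, treating `label` as the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        # No true positives: precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of per-class F1 (macro averaging, an assumption)."""
    return sum(per_class_f1(gold, pred, label) for label in labels) / len(labels)


# Toy data, invented for illustration only.
gold = ["positive", "neutral", "negative", "positive", "negative", "neutral"]
pred = ["positive", "neutral", "negative", "negative", "negative", "positive"]

for label in LABELS:
    print(label, round(per_class_f1(gold, pred, label), 4))
print("macro F1:", round(macro_f1(gold, pred), 4))
```

A weighted average, in which each class contributes in proportion to its number of gold examples, would be an equally plausible reading of the reported figure.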