2023
KingsmanTrio at SemEval-2023 Task 10: Analyzing the Effectiveness of Transfer Learning Models for Explainable Online Sexism Detection
Fareen Tasneem | Tashin Hossain | Jannatun Naim
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Online social platforms are now propagating sexist content, endangering the involvement and inclusion of women on these platforms. Sexism refers to hostility, bigotry, or discrimination based on gender, typically against women. The proliferation of such notions deters women from engaging in social media spontaneously. Hence, detecting sexist content is critical to ensure a safe online platform where women can participate without the fear of being a target of sexism. This paper describes our participation in subtask A of SemEval-2023 Task 10: Explainable Detection of Online Sexism (EDOS). This subtask requires classifying textual content as sexist or not sexist. We incorporate a RoBERTa-based architecture and further fine-tune the hyperparameters to obtain better performance. The results show the competitive performance of our approach among the other participants.
CSECU-DSG at SemEval-2023 Task 6: Segmenting Legal Documents into Rhetorical Roles via Fine-tuned Transformer Architecture
Fareen Tasneem | Tashin Hossain | Jannatun Naim | Abu Nowshed Chy
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Automated processing of legal documents is essential to manage the enormous volume of legal text and to make it easily accessible to a broad spectrum of people. However, due to the amorphous and variable nature of legal documents, it is very challenging to proceed directly with complicated processes such as summarization, analysis, and querying. Segmenting the documents by rhetorical role can aid and accelerate such procedures. This paper describes our participation in SemEval-2023 Task 6, Subtask A: Rhetorical Roles Prediction. We utilize a fine-tuned Legal-BERT to address this task. We also conduct an error analysis to illustrate the shortcomings of our deployed approach.
2021
CSECU-DSG at SemEval-2021 Task 5: Leveraging Ensemble of Sequence Tagging Models for Toxic Spans Detection
Tashin Hossain | Jannatun Naim | Fareen Tasneem | Radiathun Tasnia | Abu Nowshed Chy
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
The upsurge of prolific blogging and microblogging platforms has enabled abusers to spread negativity and threats more widely than ever. Detecting the toxic portions of a text substantially helps to moderate or remove the abusive parts and so maintain sound online platforms. This paper describes our participation in the SemEval-2021 toxic spans detection task. The task requires detecting the spans that convey toxic remarks in a given text. We explore an ensemble of sequence labeling models, including a BiLSTM-CRF, a spaCy NER model with custom toxic tags, and a fine-tuned BERT model, to identify the toxic spans. Finally, a majority voting ensemble method is used to determine the unified toxic spans. Experimental results show the competitive performance of our model among the participants.
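The majority voting step mentioned in this abstract can be illustrated with a minimal sketch. It assumes, since the abstract does not say so, that each model outputs a list of toxic character offsets and that an offset is kept when more than half of the models mark it. The function name `majority_vote_spans` and the sample offsets are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote_spans(predictions):
    """Keep a character offset if more than half of the models marked it toxic.

    `predictions` is a list with one entry per model; each entry is an
    iterable of character offsets (an assumed output format).
    """
    votes = Counter()
    for offsets in predictions:
        # set() guards against a model listing the same offset twice.
        votes.update(set(offsets))
    threshold = len(predictions) / 2
    return sorted(offset for offset, count in votes.items() if count > threshold)


# Hypothetical offsets from the three model families named in the abstract.
bilstm_crf = [4, 5, 6, 7, 20, 21]
spacy_ner = [4, 5, 6, 7]
bert = [5, 6, 7, 8, 20, 21]

unified = majority_vote_spans([bilstm_crf, spacy_ner, bert])
print(unified)  # [4, 5, 6, 7, 20, 21]
```

Offset 8 is dropped because only one model predicted it, while offsets 4, 20, and 21 survive with two votes each.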
CSECU-DSG at SemEval-2021 Task 6: Orchestrating Multimodal Neural Architectures for Identifying Persuasion Techniques in Texts and Images
Tashin Hossain | Jannatun Naim | Fareen Tasneem | Radiathun Tasnia | Abu Nowshed Chy
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
Inscribing persuasion techniques in memes is the most impactful way to influence people's mindsets. People are more inclined to memes as they are more stimulating and convincing, and hence memes are often exploited by tactfully engraving propaganda in their context with the intent of attaining a specific agenda. This paper describes our participation in the three subtasks featured in SemEval-2021 Task 6 on the detection of persuasion techniques in texts and images. We utilize a fusion of logistic regression, decision tree, and fine-tuned DistilBERT for tackling subtask 1. As for subtask 2, we propose a system that consolidates a span identification model and a multi-label classification model based on pre-trained BERT. We address the multi-modal multi-label classification of memes defined in subtask 3 by utilizing a ResNet50-based image model, a DistilBERT-based text model, and a multi-modal architecture based on a multikernel CNN+LSTM and MLP model. The outcomes illustrate the competitive performance of our systems.
2020
CSECU-DSG at WNUT-2020 Task 2: Exploiting Ensemble of Transfer Learning and Hand-crafted Features for Identification of Informative COVID-19 English Tweets
Fareen Tasneem | Jannatun Naim | Radiathun Tasnia | Tashin Hossain | Abu Nowshed Chy
Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)
The COVID-19 pandemic has become a trending topic on Twitter, and people are interested in sharing diverse information ranging from new cases and healthcare guidelines to medicine and vaccine news. Such information helps people stay updated about the situation and is beneficial to public safety personnel for decision making. However, the informal nature of Twitter makes it challenging to filter the informative tweets from the huge tweet streams. To address these challenges, WNUT-2020 introduced a shared task focusing on COVID-19 related informative tweet identification. In this paper, we describe our participation in this task. We propose a neural model that adopts the strength of transfer learning and hand-crafted features in a unified architecture. To extract the transfer learning features, we utilize the state-of-the-art pre-trained sentence embedding models BERT, RoBERTa, and InferSent, whereas various Twitter characteristics are exploited to extract the hand-crafted features. Next, various feature combinations are utilized to train a set of multilayer perceptrons (MLPs) as the base classifiers. Finally, a majority voting based fusion approach is employed to determine the informative tweets. Our approach achieved competitive performance and outperformed the baseline by approximately 7%.
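As a rough illustration of the final fusion step described in this abstract, the sketch below takes the per-tweet labels produced by several base classifiers and returns the most frequent label for each tweet. It assumes an odd number of binary classifiers so that ties cannot occur. The function name `fuse_by_majority`, the label strings, and the sample predictions are illustrative assumptions, not details from the paper.

```python
from collections import Counter


def fuse_by_majority(label_lists):
    """Return the most common label per tweet across all base classifiers.

    `label_lists[i][j]` is the label classifier i assigned to tweet j.
    Assumes an odd number of binary classifiers, so no tie-breaking is needed.
    """
    fused = []
    for labels in zip(*label_lists):
        fused.append(Counter(labels).most_common(1)[0][0])
    return fused


# Hypothetical predictions from three MLPs trained on different feature sets.
mlp_a = ["INFORMATIVE", "UNINFORMATIVE", "INFORMATIVE"]
mlp_b = ["INFORMATIVE", "INFORMATIVE", "UNINFORMATIVE"]
mlp_c = ["UNINFORMATIVE", "UNINFORMATIVE", "UNINFORMATIVE"]

fused_labels = fuse_by_majority([mlp_a, mlp_b, mlp_c])
print(fused_labels)  # ['INFORMATIVE', 'UNINFORMATIVE', 'UNINFORMATIVE']
```

Each fused label is the one that at least two of the three classifiers agree on.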