Hasan Mesbaul Ali Taher


2024

pdf bib
CUET_Binary_Hackers@DravidianLangTech EACL2024: Fake News Detection in Malayalam Language Leveraging Fine-tuned MuRIL BERT
Salman Farsi | Asrarul Eusha | Ariful Islam | Hasan Mesbaul Ali Taher | Jawad Hossain | Shawly Ahsan | Avishek Das | Mohammed Moshiul Hoque
Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

Due to technological advancements, various methods have emerged for disseminating news to the masses. The pervasive reach of news, however, has given rise to a significant concern: the proliferation of fake news. In response to this challenge, a shared task in Dravidian- LangTech EACL2024 was initiated to detect fake news and classify its types in the Malayalam language. The shared task consisted of two sub-tasks. Task 1 focused on a binary classification problem, determining whether a piece of news is fake or not. Whereas task 2 delved into a multi-class classification problem, categorizing news into five distinct levels. Our approach involved the exploration of various machine learning (RF, SVM, XGBoost, Ensemble), deep learning (BiLSTM, CNN), and transformer-based models (MuRIL, Indic- SBERT, m-BERT, XLM-R, Distil-BERT) by emphasizing parameter tuning to enhance overall model performance. As a result, we introduce a fine-tuned MuRIL model that leverages parameter tuning, achieving notable success with an F1-score of 0.86 in task 1 and 0.5191 in task 2. This successful implementation led to our system securing the 3rd position in task 1 and the 1st position in task 2. The source code will be found in the GitHub repository at this link: https://github.com/Salman1804102/ DravidianLangTech-EACL-2024-FakeNews.

pdf bib
CUET_NLP_GoodFellows@DravidianLangTech EACL2024: A Transformer-Based Approach for Detecting Fake News in Dravidian Languages
Md Osama | Kawsar Ahmed | Hasan Mesbaul Ali Taher | Jawad Hossain | Shawly Ahsan | Mohammed Moshiul Hoque
Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

In this modern era, many people have been using Facebook and Twitter, leading to increased information sharing and communication. However, a considerable amount of information on these platforms is misleading or intentionally crafted to deceive users, which is often termed as fake news. A shared task on fake news detection in Malayalam organized by DravidianLangTech@EACL 2024 allowed us for addressing the challenge of distinguishing between original and fake news content in the Malayalam language. Our approach involves creating an intelligent framework to categorize text as either fake or original. We experimented with various machine learning models, including Logistic Regression, Decision Tree, Random Forest, Multinomial Naive Bayes, SVM, and SGD, and various deep learning models, including CNN, BiLSTM, and BiLSTM + Attention. We also explored Indic-BERT, MuRIL, XLM-R, and m-BERT for transformer-based approaches. Notably, our most successful model, m-BERT, achieved a macro F1 score of 0.85 and ranked 4th in the shared task. This research contributes to combating misinformation on social media news, offering an effective solution to classify content accurately.

pdf bib
CUET_NLP_Manning@LT-EDI 2024: Transformer-based Approach on Caste and Migration Hate Speech Detection
Md Alam | Hasan Mesbaul Ali Taher | Jawad Hossain | Shawly Ahsan | Mohammed Moshiul Hoque
Proceedings of the Fourth Workshop on Language Technology for Equality, Diversity, Inclusion

The widespread use of online communication has caused a significant increase in the spread of hate speech on social media. However, there are also hate crimes based on caste and migration status. Despite several nations efforts to bring equality among their citizens, numerous crimes occur just based on caste. Migration-based hostility happens both in India and in developed countries. A shared task was arranged to address this issue in a low-resourced language such as Tamil. This paper aims to improve the detection of hate speech and hostility based on caste and migration status on social media. To achieve this, this work investigated several Machine Learning (ML), Deep Learning (DL), and transformer-based models, including M-BERT, XLM-R, and Tamil BERT. Experimental results revealed the highest macro f1-score of 0.80 using the M-BERT model, which enabled us to rank 3rd on the shared task.

2023

pdf bib
NLP_CUET at BLP-2023 Task 1: Fine-grained Categorization of Violence Inciting Text using Transformer-based Approach
Jawad Hossain | Hasan Mesbaul Ali Taher | Avishek Das | Mohammed Moshiul Hoque
Proceedings of the First Workshop on Bangla Language Processing (BLP-2023)

The amount of online textual content has increased significantly in recent years through social media posts, online chatting, web portals, and other digital platforms due to the significant increase in internet users and their unprompted access via digital devices. Unfortunately, the misappropriation of textual communication via the Internet has led to violence-inciting texts. Despite the availability of various forms of violence-inciting materials, text-based content is often used to carry out violent acts. Thus, developing a system to detect violence-inciting text has become vital. However, creating such a system in a low-resourced language like Bangla becomes challenging. Therefore, a shared task has been arranged to detect violence-inciting text in Bangla. This paper presents a hybrid approach (GAN+Bangla-ELECTRA) to classify violence-inciting text in Bangla into three classes: direct, passive, and non-violence. We investigated a variety of deep learning (CNN, BiLSTM, BiLSTM+Attention), machine learning (LR, DT, MNB, SVM, RF, SGD), transformers (BERT, ELECTRA), and GAN-based models to detect violence inciting text in Bangla. Evaluation results demonstrate that the GAN+Bangla-ELECTRA model gained the highest macro f1-score (74.59), which obtained us a rank of 3rd position at the BLP-2023 Task 1.