G. M. Shahariar
Also published as: G M Shahariar
2024
A Comparative Analysis of Noise Reduction Methods in Sentiment Analysis on Noisy Bangla Texts
Kazi Elahi
|
Tasnuva Rahman
|
Shakil Shahriar
|
Samir Sarker
|
Md. Shawon
|
G. M. Shahariar
Proceedings of the Ninth Workshop on Noisy and User-generated Text (W-NUT 2024)
While Bangla is considered a language with limited resources, sentiment analysis has been a subject of extensive research in the literature. Nevertheless, there is a scarcity of exploration into sentiment analysis specifically in the realm of noisy Bangla texts. In this paper, we introduce a dataset (NC-SentNoB) that we annotated manually to identify ten different types of noise found in a pre-existing sentiment analysis dataset comprising of around 15K noisy Bangla texts. At first, given an input noisy text, we identify the noise type, addressing this as a multi-label classification task. Then, we introduce baseline noise reduction methods to alleviate noise prior to conducting sentiment analysis. Finally, we assess the performance of fine-tuned sentiment analysis models with both noisy and noise-reduced texts to make comparisons. The experimental findings indicate that the noise reduction methods utilized are not satisfactory, highlighting the need for more suitable noise reduction methods in future research endeavors. We have made the implementation and dataset presented in this paper publicly available at https://github.com/ktoufiquee/A-Comparative-Analysis-of-Noise-Reduction-Methods-in-Sentiment-Analysis-on-Noisy-Bangla-Texts
Adversarial Attacks on Parts of Speech: An Empirical Study in Text-to-Image Generation
G M Shahariar
|
Jia Chen
|
Jiachen Li
|
Yue Dong
Findings of the Association for Computational Linguistics: EMNLP 2024
Recent studies show that text-to-image (T2I) models are vulnerable to adversarial attacks, especially with noun perturbations in text prompts. In this study, we investigate the impact of adversarial attacks on different POS tags within text prompts on the images generated by T2I models. We create a high-quality dataset for realistic POS tag token swapping and perform gradient-based attacks to find adversarial suffixes that mislead T2I models into generating images with altered tokens. Our empirical results show that the attack success rate (ASR) varies significantly among different POS tag categories, with nouns, proper nouns, and adjectives being the easiest to attack. We explore the mechanism behind the steering effect of adversarial suffixes, finding that the number of critical tokens and information fusion vary among POS tags, while features like suffix transferability are consistent across categories.
Search
Co-authors
- Kazi Elahi 1
- Tasnuva Rahman 1
- Shakil Shahriar 1
- Samir Sarker 1
- Md. Shawon 1
- show all...