Anirban Bhowmick
2021
Sentiment Analysis For Bengali Using Transformer Based Models
Anirban Bhowmick | Abhik Jana
Proceedings of the 18th International Conference on Natural Language Processing (ICON)
Sentiment analysis is a key Natural Language Processing (NLP) task that has been studied extensively for resource-rich languages such as English. For low-resource languages such as Bengali, however, very few attempts have been made, for reasons that include the lack of corpora for training machine learning models and the lack of gold-standard datasets for evaluation. With the emergence of transformer models pre-trained on several languages, researchers have become interested in investigating the applicability of these models to NLP tasks, especially for low-resource languages. In this paper, we investigate the usefulness of two pre-trained transformer models, namely multilingual BERT and XLM-RoBERTa (with fine-tuning), for sentiment analysis of the Bengali language. We use three Bengali datasets for evaluation and produce state-of-the-art performance, reaching a maximum of 95% accuracy on a two-class sentiment classification task. We believe this work can serve as a good benchmark for sentiment analysis in Bengali.
How Hateful are Movies? A Study and Prediction on Movie Subtitles
Niklas von Boguszewski | Sana Moin | Anirban Bhowmick | Seid Muhie Yimam | Chris Biemann
Proceedings of the 17th Conference on Natural Language Processing (KONVENS 2021)