Pradeep Roy
2022
IIITSurat@LT-EDI-ACL2022: Hope Speech Detection using Machine Learning
Pradeep Roy
|
Snehaan Bhawal
|
Abhinav Kumar
|
Bharathi Raja Chakravarthi
Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion
This paper addresses the issue of Hope Speech detection using machine learning techniques. Designing a robust model that helps in predicting the target class with higher accuracy is a challenging task in machine learning, especially when the distribution of the class labels is highly imbalanced. This study uses and compares the experimental outcomes of the different oversampling techniques. Many models are implemented to classify the comments into Hope and Non-Hope speech, and it found that machine learning algorithms perform better than deep learning models. The English language dataset used in this research was developed by collecting YouTube comments and is part of the task “ACL-2022:Hope Speech Detection for Equality, Diversity, and Inclusion”. The proposed model achieved a weighted F1-score of 0.55 on the test dataset and secured the first rank among the participated teams.
SOA_NLP@LT-EDI-ACL2022: An Ensemble Model for Hope Speech Detection from YouTube Comments
Abhinav Kumar
|
Sunil Saumya
|
Pradeep Roy
Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion
Language should be accommodating of equality and diversity as a fundamental aspect of communication. The language of internet users has a big impact on peer users all over the world. On virtual platforms such as Facebook, Twitter, and YouTube, people express their opinions in different languages. People respect others’ accomplishments, pray for their well-being, and cheer them on when they fail. Such motivational remarks are hope speech remarks. Simultaneously, a group of users encourages discrimination against women, people of color, people with disabilities, and other minorities based on gender, race, sexual orientation, and other factors. To recognize hope speech from YouTube comments, the current study offers an ensemble approach that combines a support vector machine, logistic regression, and random forest classifiers. Extensive testing was carried out to discover the best features for the aforementioned classifiers. In the support vector machine and logistic regression classifiers, char-level TF-IDF features were used, whereas in the random forest classifier, word-level features were used. The proposed ensemble model performed significantly well among English, Spanish, Tamil, Malayalam, and Kannada YouTube comments.
Search