José Oropeza

2024

Social Media Fake News Classification Using Machine Learning Algorithm
Girma Bade | Olga Kolesnikova | Grigori Sidorov | José Oropeza
Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

The rise of social media has facilitated easier communication, information sharing, and current affairs updates. However, the prevalence of misleading and deceptive content, commonly referred to as fake news, poses a significant challenge. This paper focuses on the classification of fake news in Malayalam, a Dravidian language, utilizing natural language processing (NLP) techniques. To develop a model, we employed a random forest machine learning method on a dataset provided by a shared task (DravidianLangTech@EACL 2024). When evaluated on the separate test dataset, our model achieved a macro F1 score of 0.71.
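For readers unfamiliar with the metric reported above, macro F1 is the unweighted mean of the per-class F1 scores, so each class counts equally regardless of how many examples it has. The following is a minimal sketch in plain Python; the function name `macro_f1` and the toy labels are hypothetical illustrations and are not taken from the paper or the shared-task data.

```python
def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # Count true positives, false positives, and false negatives for this class.
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels (assumption: a binary fake/original labeling).
y_true = ["fake", "fake", "original", "original"]
y_pred = ["fake", "original", "original", "original"]
score = macro_f1(y_true, y_pred)
print(round(score, 3))
```

Because the average is unweighted, a model that performs poorly on a minority class is penalized more under macro F1 than under plain accuracy.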

Social Media Hate and Offensive Speech Detection Using Machine Learning method
Girma Bade | Olga Kolesnikova | Grigori Sidorov | José Oropeza
Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

Although the improper use of social media is increasing, technology also offers solutions. Here, improper use means posting hate and offensive speech that might harm an individual or group. Hate speech refers to insults directed at an individual or group based on their identity. Spreading it on social media platforms is a serious problem for society. One solution is natural language processing (NLP) technology, which is capable of detecting and handling such problems. This paper presents the detection of hate and offensive speech on social media in the code-mixed Telugu language. The task and gold-standard dataset were provided by the shared task organizers (DravidianLangTech@EACL 2024). We employed the TF-IDF technique for numeric feature extraction and a random forest algorithm to model hate speech detection. Finally, the developed model was evaluated on the test dataset and achieved a macro-F1 of 0.492.
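The feature-extraction step named in the abstract can be illustrated with a small sketch. It computes the textbook TF-IDF weighting (term frequency times log inverse document frequency) in plain Python. The abstract does not say which implementation or TF-IDF variant the authors used, so the function name `tfidf_vectors`, the whitespace tokenization, the unsmoothed formula, and the toy sentences are all assumptions made for illustration. The random forest classifier that would consume these vectors is omitted here; in practice a library such as scikit-learn would typically handle both steps.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn raw texts into TF-IDF vectors over a shared, sorted vocabulary."""
    # Assumption: lowercase whitespace tokenization, chosen only for simplicity.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for doc in tokenized for tok in set(doc))
    # Unsmoothed IDF; libraries often use smoothed variants instead.
    idf = {tok: math.log(n_docs / df[tok]) for tok in vocab}
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        length = len(doc) or 1  # guard against empty documents
        vectors.append([counts[tok] / length * idf[tok] for tok in vocab])
    return vocab, vectors


# Hypothetical toy texts, not drawn from the shared-task dataset.
docs = ["you are great", "you are awful", "great game today"]
vocab, vectors = tfidf_vectors(docs)
for doc, vec in zip(docs, vectors):
    print(doc, [round(v, 3) for v in vec])
```

Terms that appear in only one text, such as "awful", receive the highest weights, while terms shared across texts are down-weighted. Each resulting row is a fixed-length numeric vector of the kind a classifier can be trained on.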
