Ron Keinan


2024

In this paper, I describe my submission to the SemEval-2024 contest. I tackled Task 1, “Semantic Textual Relatedness for African and Asian Languages”, which asks for the semantic relatedness of sentence pairs, by building models for nine different languages. I vectorized the text data using a variety of embedding techniques, including Doc2Vec, TF-IDF, Sentence-Transformers, BERT, RoBERTa, and more, and applied 11 traditional regression-type machine learning methods for analysis and evaluation.
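As a rough illustration of the vectorize-then-score idea, the sketch below builds TF-IDF vectors for each sentence and uses their cosine similarity as a relatedness feature. It is not the pipeline from the paper: the whitespace tokenizer, the smoothed IDF formula, and the function names `tfidf_vectors`, `cosine`, and `pair_similarity` are all assumptions made for this example, and in practice such a feature would be fed to a regression model rather than used directly.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """Return one sparse TF-IDF dict per sentence (whitespace tokens, smoothed IDF)."""
    docs = [Counter(s.lower().split()) for s in sentences]
    n = len(docs)
    # Document frequency: iterating a Counter yields each distinct term once.
    df = Counter(term for doc in docs for term in doc)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pair_similarity(pairs):
    """Score each (sentence_a, sentence_b) pair with TF-IDF cosine similarity."""
    flat = [s for pair in pairs for s in pair]
    vecs = tfidf_vectors(flat)
    return [cosine(vecs[2 * i], vecs[2 * i + 1]) for i in range(len(pairs))]


if __name__ == "__main__":
    # Toy pairs, invented for this example only.
    toy_pairs = [
        ("the cat sat on the mat", "the cat sat on the mat"),
        ("the cat sat on the mat", "a cat lay on the rug"),
        ("a dog barks", "stock prices fell"),
    ]
    for pair, score in zip(toy_pairs, pair_similarity(toy_pairs)):
        print(pair, round(score, 3))
```

Identical sentences score 1, sentences with no shared tokens score 0, and partially overlapping sentences fall in between. A dense embedding such as Sentence-Transformers would replace `tfidf_vectors` while keeping the same pair-scoring structure.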

2023

In this paper, we describe our submissions to the SemEval-2023 contest. We tackled Task 12, “AfriSenti-SemEval: Sentiment Analysis for Low-resource African Languages using Twitter Dataset”. We developed separate models for 12 African languages and a 13th model for a multilingual dataset built from these 12 languages. We applied a wide variety of word and character n-grams weighted by their TF-IDF values, 4 classical machine learning methods, 2 deep learning methods, and 3 oversampling methods. We also used 12 sentiment lexicons and applied extensive hyperparameter tuning.
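The feature extraction and class balancing mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not name the three oversampling methods, so plain random oversampling is used here as a stand-in, and the helper names `char_ngrams`, `word_ngrams`, and `random_oversample` are hypothetical.

```python
import random
from collections import Counter


def char_ngrams(text, n):
    """All overlapping character n-grams of the text (empty if the text is too short)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text, n):
    """All overlapping word n-grams, using whitespace tokenization."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until all classes are equal in size."""
    rng = random.Random(seed)
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_label.values())
    out_samples, out_labels = [], []
    for label, group in by_label.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


if __name__ == "__main__":
    # Toy tweets and labels, invented for this example only.
    tweets = ["good day", "great news", "nice work", "bad luck"]
    labels = ["positive", "positive", "positive", "negative"]
    print(char_ngrams("good", 2))
    print(word_ngrams("a very good day", 2))
    balanced_tweets, balanced_labels = random_oversample(tweets, labels)
    print(Counter(balanced_labels))
```

Oversampling should be applied to the training split only, so that duplicated samples never leak into the evaluation data.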