K. Salas-Jimenez


2024

WikiBias as an Extrapolation Corpus for Bias Detection
K. Salas-Jimenez | Francisco Fernando Lopez-Ponce | Sergio-Luis Ojeda-Trueba | Gemma Bel-Enguix
Proceedings of the First Workshop on Advancing Natural Language Processing for Wikipedia

This paper explores whether a machine learning model trained on Wikipedia data can detect subjectivity in sentences and generalize effectively to other domains. To test this, we ran experiments with the WikiBias corpus, the BABE corpus, and the CheckThat! dataset. We tested several classical ML models, including Logistic Regression, SVC, and SVR, using features such as Sentence Transformers similarity, probabilistic sentiment measures, and bias lexicons. Pre-trained models like DistilRoBERTa, as well as large language models like Gemma and GPT-4, were also tested on the same classification task.
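
The cross-domain setup described in the abstract (train on one corpus, evaluate on another) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the sentences are invented toy examples rather than WikiBias, BABE, or CheckThat! data, `BIAS_LEXICON` is a hypothetical six-word lexicon, and a single lexicon-hit count stands in for the richer features the abstract lists. The logistic regression is a small hand-written one so the example needs only the standard library.

```python
import math

# Hypothetical mini-lexicon of subjective cue words (illustrative only).
BIAS_LEXICON = {"terrible", "amazing", "clearly", "disgraceful", "brilliant", "awful"}

def features(sentence):
    """Return [intercept term, number of lexicon hits] for one sentence."""
    tokens = [t.strip(".,!?").lower() for t in sentence.split()]
    hits = sum(t in BIAS_LEXICON for t in tokens)
    return [1.0, float(hits)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit logistic regression weights with plain stochastic gradient ascent."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
            w = [wj + lr * (yi - p) * xj for wj, xj in zip(w, xi)]
    return w

def predict(w, x):
    """Label 1 (subjective) when the predicted probability exceeds 0.5."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else 0

# Toy "source domain" (encyclopedia-style) training set: 1 = subjective.
source = [
    ("The bridge was completed in 1932.", 0),
    ("The city has a population of two million.", 0),
    ("The film is clearly an amazing achievement.", 1),
    ("His terrible policies ruined the region.", 1),
]

# Toy "target domain" (news-style) evaluation set, never seen in training.
target = [
    ("The minister gave a brilliant speech.", 1),
    ("The council met on Tuesday.", 0),
    ("It was an awful and disgraceful decision.", 1),
    ("The report was published in March.", 0),
]

weights = train_logreg([features(s) for s, _ in source], [y for _, y in source])
predictions = [predict(weights, features(s)) for s, _ in target]
accuracy = sum(p == y for p, (_, y) in zip(predictions, target)) / len(target)
print(f"cross-domain accuracy: {accuracy:.2f}")
```

The point of the sketch is the evaluation split: the model never sees target-domain sentences during training, so its accuracy there measures extrapolation rather than in-domain fit.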