Hilal AlQuabeh
2024
SAFARI: Cross-lingual Bias and Factuality Detection in News Media and News Articles
Dilshod Azizov | Zain Muhammad Mujahid | Hilal AlQuabeh | Preslav Nakov | Shangsong Liang
Findings of the Association for Computational Linguistics: EMNLP 2024
In an era where information is quickly shared across many cultural and language contexts, the neutrality and integrity of news media are essential. Ensuring that media content remains unbiased and factual is crucial for maintaining public trust. With this in mind, we introduce SAFARI (CroSs-lingual BiAs and Factuality Detection in News MediA and News ARtIcles), a novel corpus of news media and articles for predicting political bias and the factuality of reporting in a multilingual and cross-lingual setup. To the best of our knowledge, this corpus is unprecedented, and it provides data for political bias and factuality across three tasks: (i) media-level, (ii) article-level, and (iii) joint modeling at the article level. At the media and article levels, we evaluate the cross-lingual ability of the models, whereas for joint modeling we evaluate on English data. Our frameworks set a new benchmark in the cross-lingual evaluation of political bias and factuality, using various Multilingual Pre-trained Language Models (MPLMs) and Large Language Models (LLMs) coupled with ensemble learning methods.
2023
Lotus at WojoodNER Shared Task: Multilingual Transformers: Unveiling Flat and Nested Entity Recognition
Jiyong Li | Dilshod Azizov | Hilal AlQuabeh | Shangsong Liang
Proceedings of ArabicNLP 2023
We introduce our systems developed for two subtasks of the “Wojood” shared task on Arabic NER, part of ArabicNLP 2023. For Subtask 1, we employ the XLM-R model to predict flat NER labels for given tokens using a single classifier that covers all labels. For Subtask 2, we build 21 individual classifiers on top of the XLM-R encoder. Each classifier corresponds to a specific label and determines whether that label is present. Our systems achieved competitive micro-F1 scores of 0.83 for Subtask 1 and 0.76 for Subtask 2, according to the leaderboard.
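The difference between the two decoding schemes in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the label names, the example scores, and the 0.5 threshold are assumptions chosen for illustration, and the encoder is replaced by hand-written per-token scores so that the example runs without any model.

```python
from typing import Dict, List


def decode_flat(scores: List[Dict[str, float]]) -> List[str]:
    """Flat NER: one classifier over all labels, so each token gets exactly one label."""
    return [max(token_scores, key=token_scores.get) for token_scores in scores]


def decode_nested(
    probs: List[Dict[str, float]], threshold: float = 0.5
) -> List[List[str]]:
    """Nested NER: one binary classifier per label, so a token may carry several labels.

    The threshold is an assumed value, not one reported in the abstract.
    """
    return [
        sorted(label for label, p in token_probs.items() if p >= threshold)
        for token_probs in probs
    ]


# Hypothetical scores for the two tokens of "Cairo University".
flat_scores = [
    {"O": 0.10, "ORG": 0.70, "GPE": 0.20},
    {"O": 0.20, "ORG": 0.75, "GPE": 0.05},
]
nested_probs = [
    {"ORG": 0.90, "GPE": 0.80, "PERS": 0.10},
    {"ORG": 0.85, "GPE": 0.20, "PERS": 0.05},
]

flat_labels = decode_flat(flat_scores)
nested_labels = decode_nested(nested_probs)
print(flat_labels)
print(nested_labels)
```

In the flat case the first token can only be tagged as part of the organization, while the per-label classifiers can additionally mark it as a place name nested inside the organization.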