Abed Alhakim Freihat

2024

Advancing the Arabic WordNet: Elevating Content Quality
Abed Alhakim Freihat | Hadi Mahmoud Khalilia | Gábor Bella | Fausto Giunchiglia
Proceedings of the 6th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT) with Shared Tasks on Arabic LLMs Hallucination and Dialect to MSA Machine Translation @ LREC-COLING 2024

High-quality WordNets are crucial for achieving high-quality results in NLP applications that rely on such resources. However, the wordnets of most languages suffer from serious issues of correctness and completeness with respect to the words and word meanings they define, such as incorrect lemmas, missing glosses and example sentences, or an inadequate, Western-centric representation of the morphology and the semantics of the language. Previous efforts have largely focused on increasing lexical coverage while ignoring other qualitative aspects. In this paper, we focus on the Arabic language and introduce a major revision of the Arabic WordNet that addresses multiple dimensions of lexico-semantic resource quality. As a result, we updated more than 58% of the synsets of the existing Arabic WordNet by adding missing information and correcting errors. In order to address issues of language diversity and untranslatability, we also extended the wordnet structure by new elements: phrasets and lexical gaps.

Proceedings of the 7th International Conference on Natural Language and Speech Processing (ICNLSP 2024)
Mourad Abbas | Abed Alhakim Freihat
Proceedings of the 7th International Conference on Natural Language and Speech Processing (ICNLSP 2024)

2023

Proceedings of the 6th International Conference on Natural Language and Speech Processing (ICNLSP 2023)
Mourad Abbas | Abed Alhakim Freihat
Proceedings of the 6th International Conference on Natural Language and Speech Processing (ICNLSP 2023)

2022

Proceedings of the Third International Workshop on NLP Solutions for Under Resourced Languages (NSURL 2022) co-located with ICNLSP 2022
Abed Alhakim Freihat | Mourad Abbas
Proceedings of the Third International Workshop on NLP Solutions for Under Resourced Languages (NSURL 2022) co-located with ICNLSP 2022

This paper describes a method to enrich lexical resources with content relating to linguistic diversity, based on knowledge from the field of lexical typology. We capture the phenomenon of diversity through the notion of lexical gap and use a systematic method to infer gaps semi-automatically on a large scale, which we demonstrate on the kinship domain. The resulting free diversity-aware terminological resource consists of 198 concepts, 1,911 words, and 37,370 gaps in 699 languages. We see great potential in the use of resources such as ours for the improvement of a variety of cross-lingual NLP tasks, which we illustrate through an application in the evaluation of machine translation systems.

ALRT: Cutting Edge Tool for Automatic Generation of Arabic Lexical Recognition Tests
Osama Hamed | Saeed Salah | Abed Alhakim Freihat
Proceedings of the Third International Workshop on NLP Solutions for Under Resourced Languages (NSURL 2022) co-located with ICNLSP 2022

Proceedings of the 5th International Conference on Natural Language and Speech Processing (ICNLSP 2022)
Mourad Abbas | Abed Alhakim Freihat
Proceedings of the 5th International Conference on Natural Language and Speech Processing (ICNLSP 2022)

2021

The Dimensions of Lexical Semantic Resource Quality
Hadi Khalilia | Abed Alhakim Freihat | Fausto Giunchiglia
Proceedings of the Second International Workshop on NLP Solutions for Under Resourced Languages (NSURL 2021) co-located with ICNLSP 2021

The emergence of Multi-task learning (MTL) models in recent years has helped push the state of the art in Natural Language Understanding (NLU). We strongly believe that many NLU problems in Arabic are especially poised to reap the benefits of such models. To this end we propose the Arabic Language Understanding Evaluation Benchmark (ALUE), based on 8 carefully selected and previously published tasks. For five of these, we provide new privately held evaluation datasets to ensure the fairness and validity of our benchmark. We also provide a diagnostic dataset to help researchers probe the inner workings of their models. Our initial experiments show that MTL models outperform their singly trained counterparts on most tasks. But in order to entice participation from the wider community, we stick to publishing singly trained baselines only. Nonetheless, our analysis reveals that there is plenty of room for improvement in Arabic NLU. We hope that ALUE will play a part in helping our community realize some of these improvements. Interested researchers are invited to submit their results to our online, and publicly accessible leaderboard.

Proceedings of the Second International Workshop on NLP Solutions for Under Resourced Languages (NSURL 2021) co-located with ICNLSP 2021
Abed Alhakim Freihat | Mourad Abbas
Proceedings of the Second International Workshop on NLP Solutions for Under Resourced Languages (NSURL 2021) co-located with ICNLSP 2021

Proceedings of the 4th International Conference on Natural Language and Speech Processing (ICNLSP 2021)
Mourad Abbas | Abed Alhakim Freihat
Proceedings of the 4th International Conference on Natural Language and Speech Processing (ICNLSP 2021)

The Quality of Lexical Semantic Resources: A Survey
Hadi Khalilia | Abed Alhakim Freihat | Fausto Giunchiglia
Proceedings of the 4th International Conference on Natural Language and Speech Processing (ICNLSP 2021)

2020

We present a new wordnet resource for Scottish Gaelic, a Celtic minority language spoken by about 60,000 speakers, most of whom live in Northwestern Scotland. The wordnet contains over 15 thousand word senses and was constructed by merging ten thousand new, high-quality translations, provided and validated by language experts, with an existing wordnet derived from Wiktionary. This new, considerably extended wordnet—currently among the 30 largest in the world—targets multiple communities: language speakers and learners; linguists; computer scientists solving problems related to natural language processing. By publishing it as a freely downloadable resource, we hope to contribute to the long-term preservation of Scottish Gaelic as a living language, both offline and on the Web.

2019

Proceedings of the First International Workshop on NLP Solutions for Under Resourced Languages (NSURL 2019) co-located with ICNLSP 2019 - Short Papers
Abed Alhakim Freihat | Mourad Abbas
Proceedings of the First International Workshop on NLP Solutions for Under Resourced Languages (NSURL 2019) co-located with ICNLSP 2019 - Short Papers

ST NSURL 2019 Shared Task: Semantic Question Similarity in Arabic
Mohamed Lichouri | Mourad Abbas | Besma Benaziz | Abed Alhakim Freihat
Proceedings of the First International Workshop on NLP Solutions for Under Resourced Languages (NSURL 2019) co-located with ICNLSP 2019 - Short Papers

Proceedings of the 3rd International Conference on Natural Language and Speech Processing
Mourad Abbas | Abed Alhakim Freihat
Proceedings of the 3rd International Conference on Natural Language and Speech Processing

In this paper we discuss several models we used to classify 25 city-level Arabic dialects in addition to Modern Standard Arabic (MSA) as part of the MADAR shared task (sub-task 1). We propose an ensemble of experimentally designed, best-performing classifiers over a varied set of features. Our system achieves a macro F1-score of 69.3%, an improvement of 1.4% over the baseline model on the DEV dataset. Our best submitted run ranked third out of 19 participating teams on the TEST dataset, only 0.12% macro F1-score behind the top-ranked system.

ST MADAR 2019 Shared Task: Arabic Fine-Grained Dialect Identification
Mourad Abbas | Mohamed Lichouri | Abed Alhakim Freihat
Proceedings of the Fourth Arabic Natural Language Processing Workshop

This paper describes the solution that we propose for the MADAR 2019 Arabic Fine-Grained Dialect Identification task. The proposed solution uses a set of classifiers that we trained on character and word features. These classifiers are: Support Vector Machines (SVM), Bernoulli Naive Bayes (BNB), Multinomial Naive Bayes (MNB), Logistic Regression (LR), Stochastic Gradient Descent (SGD), Passive Aggressive (PA) and Perceptron (PC). The system achieved competitive results, with a performance of 62.87% and 62.12% on the development and test sets, respectively.

2017

TrentoTeam at SemEval-2017 Task 3: An application of Grice Maxims in Ranking Community Question Answers
Mohammed R. H. Qwaider | Abed Alhakim Freihat | Fausto Giunchiglia
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

In this paper we present the TrentoTeam system, which participated in Task 3 at SemEval-2017 (Nakov et al., 2017). We concentrated our work on applying Grice Maxims (used in many state-of-the-art machine learning applications (Vogel et al., 2013; Kheirabadi and Aghagolzadeh, 2012; Dale and Reiter, 1995; Franke, 2011)) to ranking the answers to a question by relevancy. In particular, we created a ranking system based on relevancy scores assigned by three main components: named entity recognition, similarity scoring, and sentiment analysis. Our system obtained results comparable to those of machine learning systems.

2016

A Taxonomic Classification of WordNet Polysemy Types
Abed Alhakim Freihat | Fausto Giunchiglia | Biswanath Dutta
Proceedings of the 8th Global WordNet Conference (GWC)

WordNet represents polysemous terms by capturing the different meanings of these terms at the lexical level, but without emphasizing the polysemy types such terms belong to. State-of-the-art polysemy approaches identify several polysemy types in WordNet, but they do not explain how to classify and organize them. In this paper, we present a novel approach for classifying polysemy types that exploits taxonomic principles, which in turn allow us to discover a set of polysemy structural patterns.
