A Benchmark and Scoring Algorithm for Enriching Arabic Synonyms

Sana Ghanem; Mustafa Jarrar; Radi Jarrar; Ibrahim Bounhas

doi:10.18653/v1/2023.gwc-1.34

Abstract

This paper addresses the task of extending a given synset with additional synonyms, taking into account synonymy strength as a fuzzy value. Given a monolingual or multilingual synset and a threshold (a fuzzy value in [0, 1]), our goal is to extract new synonyms above this threshold from existing lexicons. Our contribution is twofold: an algorithm and a benchmark dataset. The dataset consists of 3K candidate synonyms for 500 synsets. Each candidate synonym is annotated with a fuzzy value by four linguists. The dataset is important for (i) understanding how much linguists agree or disagree on synonymy and (ii) serving as a baseline to evaluate our algorithm. Our proposed algorithm extracts synonyms from existing lexicons and computes a fuzzy value for each candidate. Our evaluations show that the algorithm behaves like a linguist and that its fuzzy values are close to those proposed by linguists (measured using RMSE and MAE). The dataset and a demo page are publicly available at https://portal.sina.birzeit.edu/synonyms.
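The abstract names two error measures, RMSE and MAE, for comparing the algorithm's fuzzy values with those given by linguists, and describes keeping candidates above a threshold. The following is a minimal sketch of those two ideas only. The function names, candidate labels, and all scores are hypothetical and made up for illustration; this is not the authors' implementation or data, and the strict "greater than" comparison is an assumption about how "above this threshold" is read.

```python
import math


def mae(predicted, reference):
    """Mean absolute error between two equal-length lists of fuzzy values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def rmse(predicted, reference):
    """Root mean squared error between two equal-length lists of fuzzy values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("score lists must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


def above_threshold(scored_candidates, threshold):
    """Keep candidates whose fuzzy value is strictly above the threshold (assumed reading)."""
    return [cand for cand, score in scored_candidates.items() if score > threshold]


# Made-up scores for three hypothetical candidate synonyms.
algorithm_scores = [0.9, 0.4, 0.7]
linguist_scores = [1.0, 0.5, 0.5]

print("MAE :", mae(algorithm_scores, linguist_scores))
print("RMSE:", rmse(algorithm_scores, linguist_scores))

candidates = {"cand_a": 0.9, "cand_b": 0.4, "cand_c": 0.7}
print("kept:", above_threshold(candidates, 0.6))
```

Lower MAE and RMSE mean the computed fuzzy values sit closer to the reference annotations; RMSE penalizes large individual disagreements more heavily than MAE does.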

Anthology ID: 2023.gwc-1.34
Volume: Proceedings of the 12th Global Wordnet Conference
Month: January
Year: 2023
Address: University of the Basque Country, Donostia - San Sebastian, Basque Country
Editors: German Rigau, Francis Bond, Alexandre Rademaker
Venue: GWC
SIG: SIGLEX
Publisher: Global Wordnet Association
Pages: 274–283
URL: https://aclanthology.org/2023.gwc-1.34/
DOI: 10.18653/v1/2023.gwc-1.34
Cite (ACL): Sana Ghanem, Mustafa Jarrar, Radi Jarrar, and Ibrahim Bounhas. 2023. A Benchmark and Scoring Algorithm for Enriching Arabic Synonyms. In Proceedings of the 12th Global Wordnet Conference, pages 274–283, University of the Basque Country, Donostia - San Sebastian, Basque Country. Global Wordnet Association.
Cite (Informal): A Benchmark and Scoring Algorithm for Enriching Arabic Synonyms (Ghanem et al., GWC 2023)
PDF: https://aclanthology.org/2023.gwc-1.34.pdf
