Explicit Fine grained Syntactic and Semantic Annotation of the Idafa Construction in Arabic

Abdelati Hawwari, Mohammed Attia, Mahmoud Ghoneim, Mona Diab


Abstract
Idafa in traditional Arabic grammar is an umbrella construction that covers several phenomena, including what is expressed in English as noun-noun compounds and Saxon and Norman genitives. Additionally, Idafa participates in some other constructions, such as quantifiers, quasi-prepositions, and adjectives. Identifying the various types of the Idafa construction (IC) is important for Natural Language Processing (NLP) applications. Noun-noun compounds exhibit special behavior in most languages, which affects their semantic interpretation. Hence, distinguishing them could have an impact on downstream NLP applications. The most comprehensive syntactic representation of the Arabic language is the LDC Arabic Treebank (ATB). In the ATB, ICs are not explicitly labeled and, furthermore, there is no distinction between ICs of noun-noun relations and other traditional ICs. Hence, we devise a detailed syntactic and semantic typification process of the IC phenomenon in Arabic. We target the ATB as a platform for this classification. We render the ATB annotated not only with explicit IC labels but also with further semantic characterization, which is useful for syntactic, semantic, and cross-language processing. Our typification of IC comprises three main syntactic IC types (FIC, GIC, and TIC), which are further divided into 10 syntactic subclasses. The TIC group is further classified into semantic relations. We devise a method for automatic IC labeling and compare its yield against the CATiB treebank. Our evaluation shows that we achieve the same level of accuracy, but with the additional fine-grained classification into the various syntactic and semantic types.
Anthology ID:
L16-1567
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3569–3577
URL:
https://aclanthology.org/L16-1567
Cite (ACL):
Abdelati Hawwari, Mohammed Attia, Mahmoud Ghoneim, and Mona Diab. 2016. Explicit Fine grained Syntactic and Semantic Annotation of the Idafa Construction in Arabic. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 3569–3577, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Explicit Fine grained Syntactic and Semantic Annotation of the Idafa Construction in Arabic (Hawwari et al., LREC 2016)
PDF:
https://aclanthology.org/L16-1567.pdf