Developing a Tag-Set and Extracting the Morphological Lexicons to Build a Morphological Analyzer for Egyptian Arabic

Amany Fashwan, Sameh Alansary


Abstract
This paper sheds light on an in-progress work for building a morphological analyzer for Egyptian Arabic (EGY). To build such a tool, a tag-set schema is developed depending on a corpus of 527,000 EGY words covering different sources and genres. This tag-set schema is used in annotating about 318,940 words, morphologically, according to their contexts. Each annotated word is associated with its suitable prefix(s), original stem, tag, suffix(s), glossary, number, gender, definiteness, and conventional lemma and stem. These morphologically annotated words, in turns, are used in developing the proposed morphological analyzer where the morphological lexicons and the compatibility tables are extracted and tested. The system is compared with one of best EGY morphological analyzers; CALIMA.
Anthology ID:
2022.wanlp-1.14
Volume:
Proceedings of the Seventh Arabic Natural Language Processing Workshop (WANLP)
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates (Hybrid)
Editors:
Houda Bouamor, Hend Al-Khalifa, Kareem Darwish, Owen Rambow, Fethi Bougares, Ahmed Abdelali, Nadi Tomeh, Salam Khalifa, Wajdi Zaghouani
Venue:
WANLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
142–160
Language:
URL:
https://aclanthology.org/2022.wanlp-1.14
DOI:
10.18653/v1/2022.wanlp-1.14
Bibkey:
Cite (ACL):
Amany Fashwan and Sameh Alansary. 2022. Developing a Tag-Set and Extracting the Morphological Lexicons to Build a Morphological Analyzer for Egyptian Arabic. In Proceedings of the Seventh Arabic Natural Language Processing Workshop (WANLP), pages 142–160, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal):
Developing a Tag-Set and Extracting the Morphological Lexicons to Build a Morphological Analyzer for Egyptian Arabic (Fashwan & Alansary, WANLP 2022)
Copy Citation:
PDF:
https://aclanthology.org/2022.wanlp-1.14.pdf