Anas Shahrour
2016
Exploiting Arabic Diacritization for High Quality Automatic Annotation
Nizar Habash | Anas Shahrour | Muhamed Al-Khalil
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
We present a novel technique for Arabic morphological annotation. The technique utilizes diacritization to produce morphological annotations of quality comparable to human annotators. Although Arabic text is generally written without diacritics, diacritization is already available for large corpora of Arabic text in several genres. Furthermore, diacritization can be generated at a low cost for new text as it does not require specialized training beyond what educated Arabic typists know. The basic approach is to enrich the input to a state-of-the-art Arabic morphological analyzer with word diacritics (full or partial) to enhance its performance. When applied to fully diacritized text, our approach produces annotations with an accuracy of over 97% on lemma, part-of-speech, and tokenization combined.
CamelParser: A system for Arabic Syntactic Analysis and Morphological Disambiguation
Anas Shahrour | Salam Khalifa | Dima Taji | Nizar Habash
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations
In this paper, we present CamelParser, a state-of-the-art system for Arabic syntactic dependency analysis aligned with contextually disambiguated morphological features. CamelParser uses a state-of-the-art morphological disambiguator and improves its results using syntactically driven features. The system offers a number of output formats that include basic dependency with morphological features, two tree visualization modes, and traditional Arabic grammatical analysis.
2015
Improving Arabic Diacritization through Syntactic Analysis
Anas Shahrour | Salam Khalifa | Nizar Habash
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing