Enhancing Deep Learning with Embedded Features for Arabic Named Entity Recognition

Ali L. Hatab; Caroline Sabty; Slim Abdennadher

Abstract

The introduction of word embedding models has remarkably changed many Natural Language Processing tasks. Word embeddings can automatically capture the semantics of words and other hidden features. Nonetheless, the Arabic language is highly complex, so embeddings alone can lose important information. This paper uses Madamira, an external knowledge source, to generate additional word features. We evaluate the utility of adding these features to conventional word and character embeddings for the Named Entity Recognition (NER) task on Modern Standard Arabic (MSA). Our NER model is implemented using Bidirectional Long Short-Term Memory and Conditional Random Fields (BiLSTM-CRF). We add morphological and syntactical features to different word embeddings to train the model. The added features improve performance by an amount that depends on the embedding model used. The best performance is achieved with BERT embeddings. Moreover, to the best of our knowledge, our best model outperforms previous systems.
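As a rough illustration of the idea in the abstract, the per-token input to a sequence tagger can be formed by concatenating a word embedding, a character-level embedding, and encoded morphological features. The sketch below is not the authors' implementation: the feature names (`pos`, `gen`), the tag inventories, the vector sizes, and the use of one-hot encoding in place of learned feature embeddings are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

# Hypothetical tag inventories. The abstract does not list the features
# or tag sets actually taken from Madamira.
POS_TAGS = ["noun", "noun_prop", "verb", "prep"]
GENDERS = ["m", "f", "na"]


def one_hot(value: str, inventory: List[str]) -> List[float]:
    """One-hot encode value over inventory; all zeros if value is unseen."""
    return [1.0 if value == item else 0.0 for item in inventory]


def build_token_input(
    word_vec: Sequence[float],
    char_vec: Sequence[float],
    features: Dict[str, str],
) -> List[float]:
    """Concatenate word vector, character vector, and encoded features.

    One-hot encoding stands in for a learned feature embedding here;
    this is a simplification, not the method described in the paper.
    """
    pos = one_hot(features.get("pos", ""), POS_TAGS)
    gender = one_hot(features.get("gen", ""), GENDERS)
    return list(word_vec) + list(char_vec) + pos + gender


# Toy vectors: a 3-dimensional word embedding and a 2-dimensional
# character embedding (real embeddings are far larger).
token_input = build_token_input(
    [0.1, -0.2, 0.3],
    [0.5, 0.5],
    {"pos": "noun_prop", "gen": "m"},
)
print(len(token_input))  # 3 + 2 + 4 + 3 = 12
```

In a full system, one such vector per token would be passed to the BiLSTM layer, with a CRF layer decoding the tag sequence.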

Anthology ID: 2022.lrec-1.524
Volume: Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month: June
Year: 2022
Address: Marseille, France
Editors: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue: LREC
Publisher: European Language Resources Association
Pages: 4904–4912
URL: https://aclanthology.org/2022.lrec-1.524/
Cite (ACL): Ali L. Hatab, Caroline Sabty, and Slim Abdennadher. 2022. Enhancing Deep Learning with Embedded Features for Arabic Named Entity Recognition. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 4904–4912, Marseille, France. European Language Resources Association.
Cite (Informal): Enhancing Deep Learning with Embedded Features for Arabic Named Entity Recognition (Hatab et al., LREC 2022)
PDF: https://aclanthology.org/2022.lrec-1.524.pdf
