NileULex: A Phrase and Word Level Sentiment Lexicon for Egyptian and Modern Standard Arabic

Samhaa R. El-Beltagy


Abstract
This paper presents NileULex, an Arabic sentiment lexicon containing close to six thousand Arabic words and compound phrases. Forty-five percent of the terms and expressions in the lexicon are Egyptian or colloquial, while fifty-five percent are Modern Standard Arabic. Although many of the candidate terms were collected automatically, every term was added to the lexicon manually. One important criterion for adding a term was that it be as unambiguous as possible. The result is a lexicon of much higher quality than any translated or automatically constructed variant. To demonstrate that such a lexicon can directly benefit sentiment analysis, a very basic machine-learning-based sentiment analyser that uses unigrams, bigrams, and lexicon-based features was applied to two different Twitter datasets. The results were compared to a baseline system that uses only unigrams and bigrams. The same lexicon-based features were also generated using a publicly available translation of a popular sentiment lexicon. The experiments show that using the developed lexicon improves the results over both the baseline and the publicly available lexicon.
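The feature setup described in the abstract (unigrams, bigrams, and lexicon-based features) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not specify which lexicon-based features were used, so the two counts `lex_pos` and `lex_neg` are assumed here, and `TOY_LEXICON`, `ngrams`, and `extract_features` are hypothetical names. The lexicon entries are illustrative and are not taken from NileULex.

```python
from collections import Counter

# Hypothetical toy lexicon; entries are illustrative, NOT taken from NileULex.
TOY_LEXICON = {
    "جميل": "positive",
    "سيء": "negative",
    "مش حلو": "negative",  # two-word compound phrase
}


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text, lexicon=TOY_LEXICON):
    """Build a sparse feature dict: unigram counts, bigram counts, and
    (assumed) counts of positive and negative lexicon matches."""
    tokens = text.split()  # naive whitespace tokenisation, an assumption
    unigrams = ngrams(tokens, 1)
    bigrams = ngrams(tokens, 2)

    features = Counter()
    for gram in unigrams:
        features["uni=" + gram] += 1
    for gram in bigrams:
        features["bi=" + gram] += 1

    # Lexicon features: match single words and two-word phrases only.
    positive = negative = 0
    for gram in unigrams + bigrams:
        polarity = lexicon.get(gram)
        if polarity == "positive":
            positive += 1
        elif polarity == "negative":
            negative += 1
    features["lex_pos"] = positive
    features["lex_neg"] = negative
    return features


example = extract_features("الفيلم مش حلو")
print(example["lex_pos"], example["lex_neg"])
```

A baseline in the sense of the abstract would simply omit the `lex_pos` and `lex_neg` entries. Note that this sketch only matches lexicon phrases of up to two words; longer compound phrases would need a wider n-gram window.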
Anthology ID:
L16-1463
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
2900–2905
URL:
https://aclanthology.org/L16-1463
Cite (ACL):
Samhaa R. El-Beltagy. 2016. NileULex: A Phrase and Word Level Sentiment Lexicon for Egyptian and Modern Standard Arabic. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 2900–2905, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
NileULex: A Phrase and Word Level Sentiment Lexicon for Egyptian and Modern Standard Arabic (El-Beltagy, LREC 2016)
PDF:
https://aclanthology.org/L16-1463.pdf