Toward Qualitative Evaluation of Embeddings for Arabic Sentiment Analysis

Amira Barhoumi, Nathalie Camelin, Chafik Aloulou, Yannick Estève, Lamia Hadrich Belguith


Abstract
In this paper, we propose several protocols for evaluating embeddings specific to the Arabic sentiment analysis (SA) task. Arabic is characterized by agglutination and morphological richness, which lead to high sparsity that can degrade embedding quality. This work presents a study comparing word-based and lemma-based embeddings in the SA setting. We first study how embedding models evolve when trained on different types of corpora (polar and non-polar), and explore the variation between embeddings by observing the sentiment stability of neighbors in the embedding spaces. We then evaluate the embeddings with a neural architecture based on a convolutional neural network (CNN). We make our pre-trained embeddings freely available to the Arabic NLP research community, along with the resources used to evaluate them. Experiments are conducted on the Large Arabic-Book Reviews (LABR) corpus in a binary (positive/negative) classification setting. Our best result reaches 91.9%, which is higher than the best previously published result (91.5%).
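The idea of observing the sentiment stability of neighbors can be illustrated with a minimal sketch. This is not the authors' protocol: it assumes one possible reading, namely the share of a word's k nearest neighbors (by cosine similarity) that carry the same polarity label in a sentiment lexicon. The function names `nearest_neighbors` and `neighbor_polarity_agreement`, the toy 2-dimensional vectors, and the toy English lexicon are all hypothetical and chosen only for illustration.

```python
import numpy as np


def nearest_neighbors(word, vectors, k):
    """Return the k words closest to `word` by cosine similarity."""
    words = [w for w in vectors if w != word]
    target = vectors[word]
    matrix = np.stack([vectors[w] for w in words])
    sims = matrix @ target / (
        np.linalg.norm(matrix, axis=1) * np.linalg.norm(target)
    )
    order = np.argsort(-sims)[:k]  # indices of the k highest similarities
    return [words[i] for i in order]


def neighbor_polarity_agreement(word, vectors, lexicon, k=2):
    """Fraction of the word's labelled neighbors that share its polarity.

    Assumed measure (not taken from the paper). Returns None when none of
    the k neighbors appears in the lexicon.
    """
    neighbors = [n for n in nearest_neighbors(word, vectors, k) if n in lexicon]
    if not neighbors:
        return None
    same = sum(lexicon[n] == lexicon[word] for n in neighbors)
    return same / len(neighbors)


# Toy embedding space and polarity lexicon (hypothetical data).
vectors = {
    "good": np.array([1.0, 0.1]),
    "great": np.array([0.9, 0.2]),
    "nice": np.array([0.8, 0.3]),
    "bad": np.array([-1.0, 0.1]),
    "awful": np.array([-0.9, 0.2]),
}
lexicon = {"good": "pos", "great": "pos", "nice": "pos", "bad": "neg", "awful": "neg"}

score_good = neighbor_polarity_agreement("good", vectors, lexicon, k=2)
score_bad = neighbor_polarity_agreement("bad", vectors, lexicon, k=2)
```

In this toy space, both neighbors of "good" are positive, while "bad" has one negative and one positive neighbor. Computing the same score for embeddings trained on polar and non-polar corpora would be one way to compare how well each space keeps same-polarity words together.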
Anthology ID:
2020.lrec-1.610
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
4955–4963
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.610
Cite (ACL):
Amira Barhoumi, Nathalie Camelin, Chafik Aloulou, Yannick Estève, and Lamia Hadrich Belguith. 2020. Toward Qualitative Evaluation of Embeddings for Arabic Sentiment Analysis. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 4955–4963, Marseille, France. European Language Resources Association.
Cite (Informal):
Toward Qualitative Evaluation of Embeddings for Arabic Sentiment Analysis (Barhoumi et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.610.pdf