Language Independent Sentiment Analysis with Sentiment-Specific Word Embeddings

Carl Saroufim; Akram Almatarky; Mohammad Abdel Hady

doi:10.18653/v1/W18-6204

Abstract

Data annotation is a critical step in training a text model, but it is tedious, expensive, and time-consuming. We present a language-independent method for training a sentiment polarity model with a limited amount of manually labeled data. Word embeddings such as Word2Vec capture semantic and syntactic properties of words efficiently and yield good results for document classification. However, these embeddings may map words with opposite polarities to vectors that are close to each other. We train Sentiment-Specific Word Embeddings (SSWE) on top of an unsupervised Word2Vec model, using either Recurrent Neural Networks (RNN) or Convolutional Neural Networks (CNN), on data auto-labeled as “Positive” or “Negative”. To produce these labels, we rely on the universality of emojis and emoticons: a small set of positive and negative emojis and emoticons is used to auto-label a large number of French tweets. Finally, we apply transfer learning to refine the network weights on a small manually labeled training set. We evaluate the approach on French sentiment classification using benchmark data sets from the SemEval 2016 competition, where SSWE improves performance over Word2Vec. We also use a graph-based label propagation approach to auto-generate a sentiment lexicon.
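
The emoji-based auto-labeling step described in the abstract can be illustrated with a minimal sketch. The marker sets, the `auto_label` function name, and the choices to discard tweets with mixed markers and to strip the markers from the text are all assumptions made for illustration; the abstract does not list the emojis and emoticons used or describe these details.

```python
# Hypothetical marker sets: the abstract does not list the emojis and
# emoticons actually used, so these are illustrative only.
POSITIVE_MARKERS = {":)", ":-)", ":D", "😀", "😍"}
NEGATIVE_MARKERS = {":(", ":-(", "😞", "😡"}


def auto_label(tweet):
    """Return (label, cleaned_text), or None if the tweet cannot be labeled.

    A tweet is labeled only when it contains markers of exactly one
    polarity. Tweets with no markers or with mixed markers are skipped
    (an assumption of this sketch).
    """
    has_pos = any(marker in tweet for marker in POSITIVE_MARKERS)
    has_neg = any(marker in tweet for marker in NEGATIVE_MARKERS)
    if has_pos == has_neg:
        # Either no marker at all, or both polarities present.
        return None

    label = "Positive" if has_pos else "Negative"

    # Strip the markers so a downstream model cannot simply memorize them
    # (also an assumption of this sketch).
    cleaned = tweet
    for marker in POSITIVE_MARKERS | NEGATIVE_MARKERS:
        cleaned = cleaned.replace(marker, " ")
    return label, " ".join(cleaned.split())
```

For example, `auto_label("J'adore ce film :)")` yields a "Positive" label with the emoticon removed from the text, while a tweet without any marker returns `None` and would be left out of the auto-labeled set.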

Anthology ID: W18-6204
Volume: Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis
Month: October
Year: 2018
Address: Brussels, Belgium
Editors: Alexandra Balahur, Saif M. Mohammad, Veronique Hoste, Roman Klinger
Venue: WASSA
Publisher: Association for Computational Linguistics
Pages: 14–23
URL: https://aclanthology.org/W18-6204/
DOI: 10.18653/v1/W18-6204
Cite (ACL): Carl Saroufim, Akram Almatarky, and Mohammad Abdel Hady. 2018. Language Independent Sentiment Analysis with Sentiment-Specific Word Embeddings. In Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pages 14–23, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal): Language Independent Sentiment Analysis with Sentiment-Specific Word Embeddings (Saroufim et al., WASSA 2018)
PDF: https://aclanthology.org/W18-6204.pdf
