Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

Alexandra Balahur, Saif M. Mohammad, Veronique Hoste, Roman Klinger (Editors)

Anthology ID:
Brussels, Belgium
Association for Computational Linguistics
Bib Export formats:

pdf bib
Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis
Alexandra Balahur | Saif M. Mohammad | Veronique Hoste | Roman Klinger

pdf bib
Identifying Affective Events and the Reasons for their Polarity
Ellen Riloff

Many events have a positive or negative impact on our lives (e.g., “I bought a house” is typically good news, but ”My house burned down” is bad news). Recognizing events that have affective polarity is essential for narrative text understanding, conversational dialogue, and applications such as summarization and sarcasm detection. We will discuss our recent work on identifying affective events and categorizing them based on the underlying reasons for their affective polarity. First, we will describe a weakly supervised learning method to induce a large set of affective events from a text corpus by optimizing for semantic consistency. Second, we will present models to classify affective events based on Human Need Categories, which often explain people’s motivations and desires. Our best results use a co-training model that consists of event expression and event context classifiers and exploits both labeled and unlabeled texts. We will conclude with a discussion of interesting directions for future work in this area.

pdf bib
Deep contextualized word representations for detecting sarcasm and irony
Suzana Ilić | Edison Marrese-Taylor | Jorge Balazs | Yutaka Matsuo

Predicting context-dependent and non-literal utterances like sarcastic and ironic expressions still remains a challenging task in NLP, as it goes beyond linguistic patterns, encompassing common sense and shared knowledge as crucial components. To capture complex morpho-syntactic features that can usually serve as indicators for irony or sarcasm across dynamic contexts, we propose a model that uses character-level vector representations of words, based on ELMo. We test our model on 7 different datasets derived from 3 different data sources, providing state-of-the-art performance in 6 of them, and otherwise offering competitive results.

pdf bib
Implicit Subjective and Sentimental Usages in Multi-sense Word Embeddings
Yuqi Sun | Haoyue Shi | Junfeng Hu

In multi-sense word embeddings, contextual variations in corpus may cause a univocal word to be embedded into different sense vectors. Shi et al. (2016) show that this kind of pseudo multi-senses can be eliminated by linear transformations. In this paper, we show that pseudo multi-senses may come from a uniform and meaningful phenomenon such as subjective and sentimental usage, though they are seemingly redundant. In this paper, we present an unsupervised algorithm to find a linear transformation which can minimize the transformed distance of a group of sense pairs. The major shrinking direction of this transformation is found to be related with subjective shift. Therefore, we can not only eliminate pseudo multi-senses in multisense embeddings, but also identify these subjective senses and tag the subjective and sentimental usage of words in the corpus automatically.

pdf bib
Language Independent Sentiment Analysis with Sentiment-Specific Word Embeddings
Carl Saroufim | Akram Almatarky | Mohammad Abdel Hady

Data annotation is a critical step to train a text model but it is tedious, expensive and time-consuming. We present a language independent method to train a sentiment polarity model with limited amount of manually-labeled data. Word embeddings such as Word2Vec are efficient at incorporating semantic and syntactic properties of words, yielding good results for document classification. However, these embeddings might map words with opposite polarities, to vectors close to each other. We train Sentiment Specific Word Embeddings (SSWE) on top of an unsupervised Word2Vec model, using either Recurrent Neural Networks (RNN) or Convolutional Neural Networks (CNN) on data auto-labeled as “Positive” or “Negative”. For this task, we rely on the universality of emojis and emoticons to auto-label a large number of French tweets using a small set of positive and negative emojis and emoticons. Finally, we apply a transfer learning approach to refine the network weights with a small-size manually-labeled training data set. Experiments are conducted to evaluate the performance of this approach on French sentiment classification using benchmark data sets from SemEval 2016 competition. We were able to achieve a performance improvement by using SSWE over Word2Vec. We also used a graph-based approach for label propagation to auto-generate a sentiment lexicon.

pdf bib
Creating a Dataset for Multilingual Fine-grained Emotion-detection Using Gamification-based Annotation
Emily Öhman | Kaisla Kajava | Jörg Tiedemann | Timo Honkela

This paper introduces a gamified framework for fine-grained sentiment analysis and emotion detection. We present a flexible tool, Sentimentator, that can be used for efficient annotation based on crowd sourcing and a self-perpetuating gold standard. We also present a novel dataset with multi-dimensional annotations of emotions and sentiments in movie subtitles that enables research on sentiment preservation across languages and the creation of robust multilingual emotion detection tools. The tools and datasets are public and open-source and can easily be extended and applied for various purposes.

pdf bib
IEST: WASSA-2018 Implicit Emotions Shared Task
Roman Klinger | Orphée De Clercq | Saif Mohammad | Alexandra Balahur

Past shared tasks on emotions use data with both overt expressions of emotions (I am so happy to see you!) as well as subtle expressions where the emotions have to be inferred, for instance from event descriptions. Further, most datasets do not focus on the cause or the stimulus of the emotion. Here, for the first time, we propose a shared task where systems have to predict the emotions in a large automatically labeled dataset of tweets without access to words denoting emotions. Based on this intention, we call this the Implicit Emotion Shared Task (IEST) because the systems have to infer the emotion mostly from the context. Every tweet has an occurrence of an explicit emotion word that is masked. The tweets are collected in a manner such that they are likely to include a description of the cause of the emotion – the stimulus. Altogether, 30 teams submitted results which range from macro F1 scores of 21 % to 71 %. The baseline (Max-Ent bag of words and bigrams) obtains an F1 score of 60 % which was available to the participants during the development phase. A study with human annotators suggests that automatic methods outperform human predictions, possibly by honing into subtle textual clues not used by humans. Corpora, resources, and results are available at the shared task website at

pdf bib
Amobee at IEST 2018: Transfer Learning from Language Models
Alon Rozental | Daniel Fleischer | Zohar Kelrich

This paper describes the system developed at Amobee for the WASSA 2018 implicit emotions shared task (IEST). The goal of this task was to predict the emotion expressed by missing words in tweets without an explicit mention of those words. We developed an ensemble system consisting of language models together with LSTM-based networks containing a CNN attention mechanism. Our approach represents a novel use of language models—specifically trained on a large Twitter dataset—to predict and classify emotions. Our system reached 1st place with a macro F1 score of 0.7145.

pdf bib
IIIDYT at IEST 2018: Implicit Emotion Classification With Deep Contextualized Word Representations
Jorge Balazs | Edison Marrese-Taylor | Yutaka Matsuo

In this paper we describe our system designed for the WASSA 2018 Implicit Emotion Shared Task (IEST), which obtained 2nd place out of 30 teams with a test macro F1 score of 0.710. The system is composed of a single pre-trained ELMo layer for encoding words, a Bidirectional Long-Short Memory Network BiLSTM for enriching word representations with context, a max-pooling operation for creating sentence representations from them, and a Dense Layer for projecting the sentence representations into label space. Our official submission was obtained by ensembling 6 of these models initialized with different random seeds. The code for replicating this paper is available at

pdf bib
NTUA-SLP at IEST 2018: Ensemble of Neural Transfer Methods for Implicit Emotion Classification
Alexandra Chronopoulou | Aikaterini Margatina | Christos Baziotis | Alexandros Potamianos

In this paper we present our approach to tackle the Implicit Emotion Shared Task (IEST) organized as part of WASSA 2018 at EMNLP 2018. Given a tweet, from which a certain word has been removed, we are asked to predict the emotion of the missing word. In this work, we experiment with neural Transfer Learning (TL) methods. Our models are based on LSTM networks, augmented with a self-attention mechanism. We use the weights of various pretrained models, for initializing specific layers of our networks. We leverage a big collection of unlabeled Twitter messages, for pretraining word2vec word embeddings and a set of diverse language models. Moreover, we utilize a sentiment analysis dataset for pretraining a model, which encodes emotion related information. The submitted model consists of an ensemble of the aforementioned TL models. Our team ranked 3rd out of 30 participants, achieving an F1 score of 0.703.

pdf bib
Sentiment analysis under temporal shift
Jan Lukes | Anders Søgaard

Sentiment analysis models often rely on training data that is several years old. In this paper, we show that lexical features change polarity over time, leading to degrading performance. This effect is particularly strong in sparse models relying only on highly predictive features. Using predictive feature selection, we are able to significantly improve the accuracy of such models over time.

pdf bib
Not Just Depressed: Bipolar Disorder Prediction on Reddit
Ivan Sekulic | Matej Gjurković | Jan Šnajder

Bipolar disorder, an illness characterized by manic and depressive episodes, affects more than 60 million people worldwide. We present a preliminary study on bipolar disorder prediction from user-generated text on Reddit, which relies on users’ self-reported labels. Our benchmark classifiers for bipolar disorder prediction outperform the baselines and reach accuracy and F1-scores of above 86%. Feature analysis shows interesting differences in language use between users with bipolar disorders and the control group, including differences in the use of emotion-expressive words.

pdf bib
Topic-Specific Sentiment Analysis Can Help Identify Political Ideology
Sumit Bhatia | Deepak P

Ideological leanings of an individual can often be gauged by the sentiment one expresses about different issues. We propose a simple framework that represents a political ideology as a distribution of sentiment polarities towards a set of topics. This representation can then be used to detect ideological leanings of documents (speeches, news articles, etc.) based on the sentiments expressed towards different topics. Experiments performed using a widely used dataset show the promise of our proposed approach that achieves comparable performance to other methods despite being much simpler and more interpretable.

pdf bib
Saying no but meaning yes: negation and sentiment analysis in Basque
Jon Alkorta | Koldo Gojenola | Mikel Iruskieta

In this work, we have analyzed the effects of negation on the semantic orientation in Basque. The analysis shows that negation markers can strengthen, weaken or have no effect on sentiment orientation of a word or a group of words. Using the Constraint Grammar formalism, we have designed and evaluated a set of linguistic rules to formalize these three phenomena. The results show that two phenomena, strengthening and no change, have been identified accurately and the third one, weakening, with acceptable results.

pdf bib
Leveraging Writing Systems Change for Deep Learning Based Chinese Emotion Analysis
Rong Xiang | Yunfei Long | Qin Lu | Dan Xiong | I-Hsuan Chen

Social media text written in Chinese communities contains mixed scripts including major text written in Chinese, an ideograph-based writing system, and some minor text using Latin letters, an alphabet-based writing system. This phenomenon is called writing systems changes (WSCs). Past studies have shown that WSCs can be used to express emotions, particularly where the social and political environment is more conservative. However, because WSCs can break the syntax of the major text, it poses more challenges in Natural Language Processing (NLP) tasks like emotion classification. In this work, we present a novel deep learning based method to include WSCs as an effective feature for emotion analysis. The method first identifies all WSCs points. Then representation of the major text is learned through an LSTM model whereas the minor text is learned by a separate CNN model. Emotions in the minor text are further highlighted through an attention mechanism before emotion classification. Performance evaluation shows that incorporating WSCs features using deep learning models can improve performance measured by F1-scores compared to the state-of-the-art model.

pdf bib
Ternary Twitter Sentiment Classification with Distant Supervision and Sentiment-Specific Word Embeddings
Mats Byrkjeland | Frederik Gørvell de Lichtenberg | Björn Gambäck

The paper proposes the Ternary Sentiment Embedding Model, a new model for creating sentiment embeddings based on the Hybrid Ranking Model of Tang et al. (2016), but trained on ternary-labeled data instead of binary-labeled, utilizing sentiment embeddings from datasets made with different distant supervision methods. The model is used as part of a complete Twitter Sentiment Analysis system and empirically compared to existing systems, showing that it outperforms Hybrid Ranking and that the quality of the distant-supervised dataset has a great impact on the quality of the produced sentiment embeddings.

pdf bib
Linking News Sentiment to Microblogs: A Distributional Semantics Approach to Enhance Microblog Sentiment Classification
Tobias Daudert | Paul Buitelaar

Social media’s popularity in society and research is gaining momentum and simultaneously increasing the importance of short textual content such as microblogs. Microblogs are affected by many factors including the news media, therefore, we exploit sentiments conveyed from news to detect and classify sentiment in microblogs. Given that texts can deal with the same entity but might not be vastly related when it comes to sentiment, it becomes necessary to introduce further measures ensuring the relatedness of texts while leveraging the contained sentiments. This paper describes ongoing research introducing distributional semantics to improve the exploitation of news-contained sentiment to enhance microblog sentiment classification.

pdf bib
Aspect Based Sentiment Analysis into the Wild
Caroline Brun | Vassilina Nikoulina

In this paper, we test state-of-the-art Aspect Based Sentiment Analysis (ABSA) systems trained on a widely used dataset on actual data. We created a new manually annotated dataset of user generated data from the same domain as the training dataset, but from other sources and analyse the differences between the new and the standard ABSA dataset. We then analyse the results in performance of different versions of the same system on both datasets. We also propose light adaptation methods to increase system robustness.

pdf bib
The Role of Emotions in Native Language Identification
Ilia Markov | Vivi Nastase | Carlo Strapparava | Grigori Sidorov

We explore the hypothesis that emotion is one of the dimensions of language that surfaces from the native language into a second language. To check the role of emotions in native language identification (NLI), we model emotion information through polarity and emotion load features, and use document representations using these features to classify the native language of the author. The results indicate that emotion is relevant for NLI, even for high proficiency levels and across topics.

pdf bib
Self-Attention: A Better Building Block for Sentiment Analysis Neural Network Classifiers
Artaches Ambartsoumian | Fred Popowich

Sentiment Analysis has seen much progress in the past two decades. For the past few years, neural network approaches, primarily RNNs and CNNs, have been the most successful for this task. Recently, a new category of neural networks, self-attention networks (SANs), have been created which utilizes the attention mechanism as the basic building block. Self-attention networks have been shown to be effective for sequence modeling tasks, while having no recurrence or convolutions. In this work we explore the effectiveness of the SANs for sentiment analysis. We demonstrate that SANs are superior in performance to their RNN and CNN counterparts by comparing their classification accuracy on six datasets as well as their model characteristics such as training speed and memory consumption. Finally, we explore the effects of various SAN modifications such as multi-head attention as well as two methods of incorporating sequence position information into SANs.

pdf bib
Dual Memory Network Model for Biased Product Review Classification
Yunfei Long | Mingyu Ma | Qin Lu | Rong Xiang | Chu-Ren Huang

In sentiment analysis (SA) of product reviews, both user and product information are proven to be useful. Current tasks handle user profile and product information in a unified model which may not be able to learn salient features of users and products effectively. In this work, we propose a dual user and product memory network (DUPMN) model to learn user profiles and product reviews using separate memory networks. Then, the two representations are used jointly for sentiment prediction. The use of separate models aims to capture user profiles and product information more effectively. Compared to state-of-the-art unified prediction models, the evaluations on three benchmark datasets, IMDB, Yelp13, and Yelp14, show that our dual learning model gives performance gain of 0.6%, 1.2%, and 0.9%, respectively. The improvements are also deemed very significant measured by p-values.

pdf bib
Measuring Issue Ownership using Word Embeddings
Amaru Cuba Gyllensten | Magnus Sahlgren

Sentiment and topic analysis are common methods used for social media monitoring. Essentially, these methods answers questions such as, “what is being talked about, regarding X”, and “what do people feel, regarding X”. In this paper, we investigate another venue for social media monitoring, namely issue ownership and agenda setting, which are concepts from political science that have been used to explain voter choice and electoral outcomes. We argue that issue alignment and agenda setting can be seen as a kind of semantic source similarity of the kind “how similar is source A to issue owner P, when talking about issue X”, and as such can be measured using word/document embedding techniques. We present work in progress towards measuring that kind of conditioned similarity, and introduce a new notion of similarity for predictive embeddings. We then test this method by measuring the similarity between politically aligned media and political parties, conditioned on bloc-specific issues.

pdf bib
Sentiment Expression Boundaries in Sentiment Polarity Classification
Rasoul Kaljahi | Jennifer Foster

We investigate the effect of using sentiment expression boundaries in predicting sentiment polarity in aspect-level sentiment analysis. We manually annotate a freely available English sentiment polarity dataset with these boundaries and carry out a series of experiments which demonstrate that high quality sentiment expressions can boost the performance of polarity classification. Our experiments with neural architectures also show that CNN networks outperform LSTMs on this task and dataset.

pdf bib
Exploring and Learning Suicidal Ideation Connotations on Social Media with Deep Learning
Ramit Sawhney | Prachi Manchanda | Puneet Mathur | Rajiv Shah | Raj Singh

The increasing suicide rates amongst youth and its high correlation with suicidal ideation expression on social media warrants a deeper investigation into models for the detection of suicidal intent in text such as tweets to enable prevention. However, the complexity of the natural language constructs makes this task very challenging. Deep Learning architectures such as LSTMs, CNNs, and RNNs show promise in sentence level classification problems. This work investigates the ability of deep learning architectures to build an accurate and robust model for suicidal ideation detection and compares their performance with standard baselines in text classification problems. The experimental results reveal the merit in C-LSTM based models as compared to other deep learning and machine learning based classification models for suicidal ideation detection.

pdf bib
UTFPR at IEST 2018: Exploring Character-to-Word Composition for Emotion Analysis
Gustavo Paetzold

We introduce the UTFPR system for the Implicit Emotions Shared Task of 2018: A compositional character-to-word recurrent neural network that does not exploit heavy and/or hard-to-obtain resources. We find that our approach can outperform multiple baselines, and offers an elegant and effective solution to the problem of orthographic variance in tweets.

pdf bib
HUMIR at IEST-2018: Lexicon-Sensitive and Left-Right Context-Sensitive BiLSTM for Implicit Emotion Recognition
Behzad Naderalvojoud | Alaettin Ucan | Ebru Akcapinar Sezer

This paper describes the approaches used in HUMIR system for the WASSA-2018 shared task on the implicit emotion recognition. The objective of this task is to predict the emotion expressed by the target word that has been excluded from the given tweet. We suppose this task as a word sense disambiguation in which the target word is considered as a synthetic word that can express 6 emotions depending on the context. To predict the correct emotion, we propose a deep neural network model that uses two BiLSTM networks to represent the contexts in the left and right sides of the target word. The BiLSTM outputs achieved from the left and right contexts are considered as context-sensitive features. These features are used in a feed-forward neural network to predict the target word emotion. Besides this approach, we also combine the BiLSTM model with lexicon-based and emotion-based features. Finally, we employ all models in the final system using Bagging ensemble method. We achieved macro F-measure value of 68.8 on the official test set and ranked sixth out of 30 participants.

pdf bib
NLP at IEST 2018: BiLSTM-Attention and LSTM-Attention via Soft Voting in Emotion Classification
Qimin Zhou | Hao Wu

This paper describes our method that competed at WASSA2018 Implicit Emotion Shared Task. The goal of this task is to classify the emotions of excluded words in tweets into six different classes: sad, joy, disgust, surprise, anger and fear. For this, we examine a BiLSTM architecture with attention mechanism (BiLSTM-Attention) and a LSTM architecture with attention mechanism (LSTM-Attention), and try different dropout rates based on these two models. We then exploit an ensemble of these methods to give the final prediction which improves the model performance significantly compared with the baseline model. The proposed method achieves 7th position out of 30 teams and outperforms the baseline method by 12.5% in terms of macro F1.

pdf bib
SINAI at IEST 2018: Neural Encoding of Emotional External Knowledge for Emotion Classification
Flor Miriam Plaza-del-Arco | Eugenio Martínez-Cámara | Maite Martin | L. Alfonso Ureña- López

In this paper, we describe our participation in WASSA 2018 Implicit Emotion Shared Task (IEST 2018). We claim that the use of emotional external knowledge may enhance the performance and the capacity of generalization of an emotion classification system based on neural networks. Accordingly, we submitted four deep learning systems grounded in a sequence encoding layer. They mainly differ in the feature vector space and the recurrent neural network used in the sequence encoding layer. The official results show that the systems that used emotional external knowledge have a higher capacity of generalization, hence our claim holds.

pdf bib
EmoNLP at IEST 2018: An Ensemble of Deep Learning Models and Gradient Boosting Regression Tree for Implicit Emotion Prediction in Tweets
Man Liu

This paper describes our system submitted to IEST 2018, a shared task (Klinger et al., 2018) to predict the emotion types. Six emotion types are involved: anger, joy, fear, surprise, disgust and sad. We perform three different approaches: feed forward neural network (FFNN), convolutional BLSTM (ConBLSTM) and Gradient Boosting Regression Tree Method (GBM). Word embeddings used in convolutional BLSTM are pre-trained on 470 million tweets which are filtered using the emotional words and emojis. In addition, broad sets of features (i.e. syntactic features, lexicon features, cluster features) are adopted to train GBM and FFNN. The three approaches are finally ensembled by the weighted average of predicted probabilities of each emotion label.

pdf bib
HGSGNLP at IEST 2018: An Ensemble of Machine Learning and Deep Neural Architectures for Implicit Emotion Classification in Tweets
Wenting Wang

This paper describes our system designed for the WASSA-2018 Implicit Emotion Shared Task (IEST). The task is to predict the emotion category expressed in a tweet by removing the terms angry, afraid, happy, sad, surprised, disgusted and their synonyms. Our final submission is an ensemble of one supervised learning model and three deep neural network based models, where each model approaches the problem from essentially different directions. Our system achieves the macro F1 score of 65.8%, which is a 5.9% performance improvement over the baseline and is ranked 12 out of 30 participating teams.

pdf bib
DataSEARCH at IEST 2018: Multiple Word Embedding based Models for Implicit Emotion Classification of Tweets with Deep Learning
Yasas Senarath | Uthayasanker Thayasivam

This paper describes an approach to solve implicit emotion classification with the use of pre-trained word embedding models to train multiple neural networks. The system described in this paper is composed of a sequential combination of Long Short-Term Memory and Convolutional Neural Network for feature extraction and Feedforward Neural Network for classification. In this paper, we successfully show that features extracted using multiple pre-trained embeddings can be used to improve the overall performance of the system with Emoji being one of the significant features. The evaluations show that our approach outperforms the baseline system by more than 8% without using any external corpus or lexicon. This approach is ranked 8th in Implicit Emotion Shared Task (IEST) at WASSA-2018.

pdf bib
NL-FIIT at IEST-2018: Emotion Recognition utilizing Neural Networks and Multi-level Preprocessing
Samuel Pecar | Michal Farkas | Marian Simko | Peter Lacko | Maria Bielikova

In this paper, we present neural models submitted to Shared Task on Implicit Emotion Recognition, organized as part of WASSA 2018. We propose a Bi-LSTM architecture with regularization through dropout and Gaussian noise. Our models use three different embedding layers: GloVe word embeddings trained on Twitter dataset, ELMo embeddings and also sentence embeddings. We see preprocessing as one of the most important parts of the task. We focused on handling emojis, emoticons, hashtags, and also various shortened word forms. In some cases, we proposed to remove some parts of the text, as they do not affect emotion of the original sentence. We also experimented with other modifications like category weights for learning and stacking multiple layers. Our model achieved a macro average F1 score of 65.55%, significantly outperforming the baseline model produced by a simple logistic regression.

pdf bib
UWB at IEST 2018: Emotion Prediction in Tweets with Bidirectional Long Short-Term Memory Neural Network
Pavel Přibáň | Jiří Martínek

This paper describes our system created for the WASSA 2018 Implicit Emotion Shared Task. The goal of this task is to predict the emotion of a given tweet, from which a certain emotion word is removed. The removed word can be sad, happy, disgusted, angry, afraid or a synonym of one of them. Our proposed system is based on deep-learning methods. We use Bidirectional Long Short-Term Memory (BiLSTM) with word embeddings as an input. Pre-trained DeepMoji model and pre-trained emoji2vec emoji embeddings are also used as additional inputs. Our System achieves 0.657 macro F1 score and our rank is 13th out of 30.

pdf bib
USI-IR at IEST 2018: Sequence Modeling and Pseudo-Relevance Feedback for Implicit Emotion Detection
Esteban Ríssola | Anastasia Giachanou | Fabio Crestani

This paper describes the participation of USI-IR in WASSA 2018 Implicit Emotion Shared Task. We propose a relevance feedback approach employing a sequential model (biLSTM) and word embeddings derived from a large collection of tweets. To this end, we assume that the top-k predictions produce at a first classification step are correct (based on the model accuracy) and use them as new examples to re-train the network.

pdf bib
EmotiKLUE at IEST 2018: Topic-Informed Classification of Implicit Emotions
Thomas Proisl | Philipp Heinrich | Besim Kabashi | Stefan Evert

EmotiKLUE is a submission to the Implicit Emotion Shared Task. It is a deep learning system that combines independent representations of the left and right contexts of the emotion word with the topic distribution of an LDA topic model. EmotiKLUE achieves a macro average F₁score of 67.13%, significantly outperforming the baseline produced by a simple ML classifier. Further enhancements after the evaluation period lead to an improved F₁score of 68.10%.

pdf bib
BrainT at IEST 2018: Fine-tuning Multiclass Perceptron For Implicit Emotion Classification
Vachagan Gratian | Marina Haid

We present BrainT, a multi-class, averaged perceptron tested on implicit emotion prediction of tweets. We show that the dataset is linearly separable and explore ways in fine-tuning the baseline classifier. Our results indicate that the bag-of-words features benefit the model moderately and prediction can be improved with bigrams, trigrams, skip-one-tetragrams and POS-tags. Furthermore, we find preprocessing of the n-grams, including stemming, lowercasing, stopword filtering, emoji and emoticon conversion generally not useful. The model is trained on an annotated corpus of 153,383 tweets and predictions on the test data were submitted to the WASSA-2018 Implicit Emotion Shared Task. BrainT attained a Macro F-score of 0.63.

pdf bib
Disney at IEST 2018: Predicting Emotions using an Ensemble
Wojciech Witon | Pierre Colombo | Ashutosh Modi | Mubbasir Kapadia

This paper describes our participating system in the WASSA 2018 shared task on emotion prediction. The task focuses on implicit emotion prediction in a tweet. In this task, keywords corresponding to the six emotion labels used (anger, fear, disgust, joy, sad, and surprise) have been removed from the tweet text, making emotion prediction implicit and the task challenging. We propose a model based on an ensemble of classifiers for prediction. Each classifier uses a sequence of Convolutional Neural Network (CNN) architecture blocks and uses ELMo (Embeddings from Language Model) as an input. Our system achieves a 66.2% F1 score on the test set. The best performing system in the shared task has reported a 71.4% F1 score.

pdf bib
Sentylic at IEST 2018: Gated Recurrent Neural Network and Capsule Network Based Approach for Implicit Emotion Detection
Prabod Rathnayaka | Supun Abeysinghe | Chamod Samarajeewa | Isura Manchanayake | Malaka Walpola

In this paper, we present the system we have used for the Implicit WASSA 2018 Implicit Emotion Shared Task. The task is to predict the emotion of a tweet of which the explicit mentions of emotion terms have been removed. The idea is to come up with a model which has the ability to implicitly identify the emotion expressed given the context words. We have used a Gated Recurrent Neural Network (GRU) and a Capsule Network based model for the task. Pre-trained word embeddings have been utilized to incorporate contextual knowledge about words into the model. GRU layer learns latent representations using the input word embeddings. Subsequent Capsule Network layer learns high-level features from that hidden representation. The proposed model managed to achieve a macro-F1 score of 0.692.

pdf bib
Fast Approach to Build an Automatic Sentiment Annotator for Legal Domain using Transfer Learning
Viraj Salaka | Menuka Warushavithana | Nisansa de Silva | Amal Shehan Perera | Gathika Ratnayaka | Thejan Rupasinghe

This study proposes a novel way of identifying the sentiment of the phrases used in the legal domain. The added complexity of the language used in law, and the inability of the existing systems to accurately predict the sentiments of words in law are the main motivations behind this study. This is a transfer learning approach which can be used for other domain adaptation tasks as well. The proposed methodology achieves an improvement of over 6% compared to the source model’s accuracy in the legal domain.

pdf bib
What Makes You Stressed? Finding Reasons From Tweets
Reshmi Gopalakrishna Pillai | Mike Thelwall | Constantin Orasan

Detecting stress from social media gives a non-intrusive and inexpensive alternative to traditional tools such as questionnaires or physiological sensors for monitoring mental state of individuals. This paper introduces a novel framework for finding reasons for stress from tweets, analyzing multiple categories for the first time. Three word-vector based methods are evaluated on collections of tweets about politics or airlines and are found to be more accurate than standard machine learning algorithms.

pdf bib
EmojiGAN: learning emojis distributions with a generative model
Bogdan Mazoure | Thang Doan | Saibal Ray

Generative models have recently experienced a surge in popularity due to the development of more efficient training algorithms and increasing computational power. Models such as adversarial generative networks (GANs) have been successfully used in various areas such as computer vision, medical imaging, style transfer and natural language generation. Adversarial nets were recently shown to yield results in the image-to-text task, where given a set of images, one has to provide their corresponding text description. In this paper, we take a similar approach and propose a image-to-emoji architecture, which is trained on data from social networks and can be used to score a given picture using ideograms. We show empirical results of our algorithm on data obtained from the most influential Instagram accounts.

pdf bib
Identifying Opinion-Topics and Polarity of Parliamentary Debate Motions
Gavin Abercrombie | Riza Theresa Batista-Navarro

Analysis of the topics mentioned and opinions expressed in parliamentary debate motions–or proposals–is difficult for human readers, but necessary for understanding and automatic processing of the content of the subsequent speeches. We present a dataset of debate motions with pre-existing ‘policy’ labels, and investigate the utility of these labels for simultaneous topic and opinion polarity analysis. For topic detection, we apply one-versus-the-rest supervised topic classification, finding that good performance is achieved in predicting the policy topics, and that textual features derived from the debate titles associated with the motions are particularly indicative of motion topic. We then examine whether the output could also be used to determine the positions taken by proposers towards the different policies by investigating how well humans agree in interpreting the opinion polarities of the motions. Finding very high levels of agreement, we conclude that the policies used can be reliable labels for use in these tasks, and that successful topic detection can therefore provide opinion analysis of the motions ‘for free’.

pdf bib
Homonym Detection For Humor Recognition In Short Text
Sven van den Beukel | Lora Aroyo

In this paper, automatic homophone- and homograph detection are suggested as new useful features for humor recognition systems. The system combines style-features from previous studies on humor recognition in short text with ambiguity-based features. The performance of two potentially useful homograph detection methods is evaluated using crowdsourced annotations as ground truth. Adding homophones and homographs as features to the classifier results in a small but significant improvement over the style-features alone. For the task of humor recognition, recall appears to be a more important quality measure than precision. Although the system was designed for humor recognition in oneliners, it also performs well at the classification of longer humorous texts.

pdf bib
Emo2Vec: Learning Generalized Emotion Representation by Multi-task Training
Peng Xu | Andrea Madotto | Chien-Sheng Wu | Ji Ho Park | Pascale Fung

In this paper, we propose Emo2Vec which encodes emotional semantics into vectors. We train Emo2Vec by multi-task learning six different emotion-related tasks, including emotion/sentiment analysis, sarcasm classification, stress detection, abusive language classification, insult detection, and personality recognition. Our evaluation of Emo2Vec shows that it outperforms existing affect-related representations, such as Sentiment-Specific Word Embedding and DeepMoji embeddings with much smaller training corpora. When concatenated with GloVe, Emo2Vec achieves competitive performances to state-of-the-art results on several tasks using a simple logistic regression classifier.

pdf bib
Learning representations for sentiment classification using Multi-task framework
Hardik Meisheri | Harshad Khadilkar

Most of the existing state of the art sentiment classification techniques involve the use of pre-trained embeddings. This paper postulates a generalized representation that collates training on multiple datasets using a Multi-task learning framework. We incorporate publicly available, pre-trained embeddings with Bidirectional LSTM’s to develop the multi-task model. We validate the representations on an independent test Irony dataset that can contain several sentiments within each sample, with an arbitrary distribution. Our experiments show a significant improvement in results as compared to the available baselines for individual datasets on which independent models are trained. Results also suggest superior performance of the representations generated over Irony dataset.

pdf bib
Super Characters: A Conversion from Sentiment Classification to Image Classification
Baohua Sun | Lin Yang | Patrick Dong | Wenhan Zhang | Jason Dong | Charles Young

We propose a method named Super Characters for sentiment classification. This method converts the sentiment classification problem into image classification problem by projecting texts into images and then applying CNN models for classification. Text features are extracted automatically from the generated Super Characters images, hence there is no need of any explicit step of embedding the words or characters into numerical vector representations. Experimental results on large social media corpus show that the Super Characters method consistently outperforms other methods for sentiment classification and topic classification tasks on ten large social media datasets of millions of contents in four different languages, including Chinese, Japanese, Korean and English.

pdf bib
Learning Comment Controversy Prediction in Web Discussions Using Incidentally Supervised Multi-Task CNNs
Nils Rethmeier | Marc Hübner | Leonhard Hennig

Comments on web news contain controversies that manifest as inter-group agreement-conflicts. Tracking such rapidly evolving controversy could ease conflict resolution or journalist-user interaction. However, this presupposes controversy online-prediction that scales to diverse domains using incidental supervision signals instead of manual labeling. To more deeply interpret comment-controversy model decisions we frame prediction as binary classification and evaluate baselines and multi-task CNNs that use an auxiliary news-genre-encoder. Finally, we use ablation and interpretability methods to determine the impacts of topic, discourse and sentiment indicators, contextual vs. global word influence, as well as genre-keywords vs. per-genre-controversy keywords – to find that the models learn plausible controversy features using only incidentally supervised signals.

pdf bib
Words Worth: Verbal Content and Hirability Impressions in YouTube Video Resumes
Skanda Muralidhar | Laurent Nguyen | Daniel Gatica-Perez

Automatic hirability prediction from video resumes is gaining increasing attention in both psychology and computing. Most existing works have investigated hirability from the perspective of nonverbal behavior, with verbal content receiving little interest. In this study, we leverage the advances in deep-learning based text representation techniques (like word embedding) in natural language processing to investigate the relationship between verbal content and perceived hirability ratings. To this end, we use 292 conversational video resumes from YouTube, develop a computational framework to automatically extract various representations of verbal content, and evaluate them in a regression task. We obtain a best performance of R² = 0.23 using GloVe, and R² = 0.22 using Word2Vec representations for manual and automatically transcribed texts respectively. Our inference results indicate the feasibility of using deep learning based verbal content representation in inferring hirability scores from online conversational video resumes.

pdf bib
Predicting Adolescents’ Educational Track from Chat Messages on Dutch Social Media
Lisa Hilte | Walter Daelemans | Reinhild Vandekerckhove

We aim to predict Flemish adolescents’ educational track based on their Dutch social media writing. We distinguish between the three main types of Belgian secondary education: General (theory-oriented), Vocational (practice-oriented), and Technical Secondary Education (hybrid). The best results are obtained with a Naive Bayes model, i.e. an F-score of 0.68 (std. dev. 0.05) in 10-fold cross-validation experiments on the training data and an F-score of 0.60 on unseen data. Many of the most informative features are character n-grams containing specific occurrences of chatspeak phenomena such as emoticons. While the detection of the most theory- and practice-oriented educational tracks seems to be a relatively easy task, the hybrid Technical level appears to be much harder to capture based on online writing style, as expected.

pdf bib
Arabizi sentiment analysis based on transliteration and automatic corpus annotation
Imane Guellil | Ahsan Adeel | Faical Azouaou | Fodil Benali | Ala-eddine Hachani | Amir Hussain

Arabizi is a form of writing Arabic text which relies on Latin letters, numerals and punctuation rather than Arabic letters. In the literature, the difficulties associated with Arabizi sentiment analysis have been underestimated, principally due to the complexity of Arabizi. In this paper, we present an approach to automatically classify sentiments of Arabizi messages into positives or negatives. In the proposed approach, Arabizi messages are first transliterated into Arabic. Afterwards, we automatically classify the sentiment of the transliterated corpus using an automatically annotated corpus. For corpus validation, shallow machine learning algorithms such as Support Vectors Machine (SVM) and Naive Bays (NB) are used. Simulations results demonstrate the outperformance of NB algorithm over all others. The highest achieved F1-score is up to 78% and 76% for manually and automatically transliterated dataset respectively. Ongoing work is aimed at improving the transliterator module and annotated sentiment dataset.

pdf bib
UBC-NLP at IEST 2018: Learning Implicit Emotion With an Ensemble of Language Models
Hassan Alhuzali | Mohamed Elaraby | Muhammad Abdul-Mageed

We describe UBC-NLP contribution to IEST-2018, focused at learning implicit emotion in Twitter data. Among the 30 participating teams, our system ranked the 4th (with 69.3% F-score). Post competition, we were able to score slightly higher than the 3rd ranking system (reaching 70.7%). Our system is trained on top of a pre-trained language model (LM), fine-tuned on the data provided by the task organizers. Our best results are acquired by an average of an ensemble of language models. We also offer an analysis of system performance and the impact of training data size on the task. For example, we show that training our best model for only one epoch with < 40% of the data enables better performance than the baseline reported by Klinger et al. (2018) for the task.