Cynthia Van Hee

Also published as: Cynthia van Hee


2024

Human and System Perspectives on the Expression of Irony: An Analysis of Likelihood Labels and Rationales
Aaron Maladry | Alessandra Teresa Cignarella | Els Lefever | Cynthia van Hee | Veronique Hoste
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

In this paper, we examine the recognition of irony by both humans and automatic systems. We achieve this by enhancing the annotations of an English benchmark data set for irony detection. This enhancement involves a layer of human-annotated irony likelihood using a 7-point Likert scale that combines binary annotation with a confidence measure. Additionally, the annotators indicated the trigger words that led them to perceive the text as ironic, which leveraged necessary theoretical insights into the definition of irony and its various forms. By comparing these trigger word spans across annotators, we determine the extent to which humans agree on the source of irony in a text. Finally, we compare the human-annotated spans with sub-token importance attributions for fine-tuned transformers using Layer Integrated Gradients, a state-of-the-art interpretability metric. Our results indicate that our model achieves better performance on tweets that were annotated with high confidence and high agreement. Although automatic systems can identify trigger words with relative success, they still attribute a significant amount of their importance to the wrong tokens.

2023

A Fine Line Between Irony and Sincerity: Identifying Bias in Transformer Models for Irony Detection
Aaron Maladry | Els Lefever | Cynthia Van Hee | Veronique Hoste
Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis

In this paper we investigate potential bias in fine-tuned transformer models for irony detection. Bias is defined in this research as spurious associations between word n-grams and class labels, which can cause the system to rely too much on superficial cues and miss the essence of the irony. For this purpose, we looked for correlations between class labels and words that are prone to trigger irony, such as positive adjectives, intensifiers and topical nouns. Additionally, we investigate our irony model’s predictions before and after manipulating the data set through irony trigger replacements. We further support these insights with state-of-the-art explainability techniques (Layer Integrated Gradients, Discretized Integrated Gradients and Layer-wise Relevance Propagation). Both approaches confirm the hypothesis that transformer models generally encode correlations between positive sentiments and ironic texts, with even higher correlations between vividly expressed sentiment and irony. Based on these insights, we implemented a number of modification strategies to enhance the robustness of our irony classifier.

2022

SentEMO: A Multilingual Adaptive Platform for Aspect-based Sentiment and Emotion Analysis
Ellen De Geyndt | Orphee De Clercq | Cynthia Van Hee | Els Lefever | Pranaydeep Singh | Olivier Parent | Veronique Hoste
Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis

In this paper, we present the SentEMO platform, a tool that provides aspect-based sentiment analysis and emotion detection of unstructured text data such as reviews, emails and customer care conversations. Currently, models have been trained for five domains and one general domain and are implemented in a pipeline approach, where the output of one model serves as the input for the next. The results are presented in three dashboards, allowing companies to gain more insights into what stakeholders think of their products and services. The SentEMO platform is available at https://sentemo.ugent.be

Irony Detection for Dutch: a Venture into the Implicit
Aaron Maladry | Els Lefever | Cynthia Van Hee | Veronique Hoste
Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis

This paper presents the results of a replication experiment for automatic irony detection in Dutch social media text, investigating both a feature-based SVM classifier, as was done by Van Hee et al. (2017), and a transformer-based approach. In addition to building a baseline model, an important goal of this research is to explore the implementation of common-sense knowledge in the form of implicit sentiment, as we strongly believe that common-sense and connotative knowledge are essential to the identification of irony and implicit meaning in tweets. We show promising results and the presented approach can provide a solid baseline and serve as a staging ground to build on in future experiments for irony detection in Dutch.

2021

Exploring Implicit Sentiment Evoked by Fine-grained News Events
Cynthia Van Hee | Orphee De Clercq | Veronique Hoste
Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

We investigate the feasibility of defining sentiment evoked by fine-grained news events. Our research question is based on the premise that methods for detecting implicit sentiment in news can be a key driver of content diversity, which is one way to mitigate the detrimental effects of filter bubbles that recommenders based on collaborative filtering may produce. Our experiments are based on 1,735 news articles from major Flemish newspapers that were manually annotated, with high agreement, for implicit sentiment. While lexical resources prove insufficient for sentiment analysis in this data genre, our results demonstrate that machine learning models based on SVM and BERT are able to automatically infer the implicit sentiment evoked by news events.

2018

We Usually Don’t Like Going to the Dentist: Using Common Sense to Detect Irony on Twitter
Cynthia Van Hee | Els Lefever | Véronique Hoste
Computational Linguistics, Volume 44, Issue 4 - December 2018

Although common sense and connotative knowledge come naturally to most people, computers still struggle to perform well on tasks for which such extratextual information is required. Automatic approaches to sentiment analysis and irony detection have revealed that the lack of such world knowledge undermines classification performance. In this article, we therefore address the challenge of modeling implicit or prototypical sentiment in the framework of automatic irony detection. Starting from manually annotated connoted situation phrases (e.g., “flight delays,” “sitting the whole day at the doctor’s office”), we defined the implicit sentiment held towards such situations automatically by using both a lexico-semantic knowledge base and a data-driven method. We further investigate how such implicit sentiment information affects irony detection by assessing a state-of-the-art irony classifier before and after it is informed with implicit sentiment information.

SemEval-2018 Task 3: Irony Detection in English Tweets
Cynthia Van Hee | Els Lefever | Véronique Hoste
Proceedings of the 12th International Workshop on Semantic Evaluation

This paper presents the first shared task on irony detection: given a tweet, automatic natural language processing systems should determine whether the tweet is ironic (Task A) and which type of irony (if any) is expressed (Task B). The ironic tweets were collected using irony-related hashtags (i.e. #irony, #sarcasm, #not) and were subsequently manually annotated to minimise the amount of noise in the corpus. Prior to distributing the data, hashtags that were used to collect the tweets were removed from the corpus. For both tasks, a training corpus of 3,834 tweets was provided, as well as a test set containing 784 tweets. Our shared tasks received submissions from 43 teams for the binary classification Task A and from 31 teams for the multiclass Task B. The highest classification scores obtained for the two subtasks are F1 = 0.71 and F1 = 0.51, respectively, and demonstrate that fine-grained irony classification is much more challenging than binary irony detection.

2017

Noise or music? Investigating the usefulness of normalisation for robust sentiment analysis on social media data
Cynthia Van Hee | Marjan Van de Kauter | Orphée De Clercq | Els Lefever | Bart Desmet | Véronique Hoste
Traitement Automatique des Langues, Volume 58, Numéro 1 : Varia [Varia]

2016

Exploring the Realization of Irony in Twitter Data
Cynthia Van Hee | Els Lefever | Véronique Hoste
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Handling figurative language like irony is currently a challenging task in natural language processing. Since irony is commonly used in user-generated content, its presence can significantly undermine accurate analysis of opinions and sentiment in such texts. Understanding irony is therefore important if we want to push the state-of-the-art in tasks such as sentiment analysis. In this research, we present the construction of a Twitter dataset for two languages, being English and Dutch, and the development of new guidelines for the annotation of verbal irony in social media texts. Furthermore, we present some statistics on the annotated corpora, from which we can conclude that the detection of contrasting evaluations might be a good indicator for recognizing irony.

Monday mornings are my fave :) #not Exploring the Automatic Recognition of Irony in English tweets
Cynthia Van Hee | Els Lefever | Véronique Hoste
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Recognising and understanding irony is crucial for the improvement of natural language processing tasks including sentiment analysis. In this study, we describe the construction of an English Twitter corpus and its annotation for irony based on a newly developed fine-grained annotation scheme. We also explore the feasibility of automatic irony recognition by exploiting a varied set of features including lexical, syntactic, sentiment and semantic (Word2Vec) information. Experiments on a held-out test set show that our irony classifier benefits from this combined information, yielding an F1-score of 67.66%. When explicit hashtag information like #irony is included in the data, the system even obtains an F1-score of 92.77%. A qualitative analysis of the output reveals that recognising irony that results from a polarity clash appears to be (much) more feasible than recognising other forms of ironic utterances (e.g., descriptions of situational irony).

2015

LT3: Sentiment Analysis of Figurative Tweets: piece of cake #NotReally
Cynthia Van Hee | Els Lefever | Véronique Hoste
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

Detection and Fine-Grained Classification of Cyberbullying Events
Cynthia Van Hee | Els Lefever | Ben Verhoeven | Julie Mennes | Bart Desmet | Guy De Pauw | Walter Daelemans | Veronique Hoste
Proceedings of the International Conference Recent Advances in Natural Language Processing

2014

LT3: Sentiment Classification in User-Generated Content Using a Rich Feature Set
Cynthia Van Hee | Marjan Van de Kauter | Orphée De Clercq | Els Lefever | Véronique Hoste
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)