Orphee De Clercq

Also published as: Orphée De Clercq, Orphee de Clercq

2026

Lost in Activations: A Neuron-level Analysis of Encoders for Cross-Lingual Emotion Detection
Pranaydeep Singh | Orphee De Clercq | Els Lefever
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers)

The rapid advancement of multilingual pre-trained transformers has fueled significant progress in natural language understanding across diverse languages. Yet, their inner workings remain opaque, especially with regard to how individual neurons encode and generalize semantic and affective features across languages. This paper presents an interpretability study of a fine-tuned XLM-R model for multilingual emotion classification. Using neuron-level activation analysis, we investigate the variance of neurons across labels, cross-lingual alignment of activations, and the existence of “polyglot” versus language-specific neurons. Our results reveal that while certain neurons consistently encode emotion-related concepts across languages, others show strong monolingual specialization.

pdf bib

The Proceedings for the 15th Workshop on Computational Approaches to Subjectivity, Sentiment Social Media Analysis (WASSA 2026)
Jeremy Barnes | Valentin Barriere | Orphée De Clercq | Roman Klinger | Célia Nouri | Debora Nozza | Pranaydeep Singh
The Proceedings for the 15th Workshop on Computational Approaches to Subjectivity, Sentiment Social Media Analysis (WASSA 2026)

2025

pdf bib abs

EnerGIZAr: Leveraging GIZA++ for Effective Tokenizer Initialization
Pranaydeep Singh | Eneko Agirre | Gorka Azkune | Orphee De Clercq | Els Lefever
Findings of the Association for Computational Linguistics: ACL 2025

Continual pre-training has long been considered the default strategy for adapting models to non-English languages, but struggles with initializing new embeddings, particularly for non-Latin scripts. In this work, we propose EnerGIZAr, a novel methodology that improves continual pre-training by leveraging statistical word alignment techniques. Our approach utilizes GIZA++ to construct a subword-level alignment matrix between source (English) and target language tokens. This matrix enables informed initialization of target tokenizer embeddings, which provides a more effective starting point for adaptation. We evaluate EnerGIZAr against state-of-the-art initialization strategies such as OFA and FOCUS across four typologically diverse languages: Hindi, Basque, Arabic and Korean. Experimental results on key NLP tasks – including POS tagging, Sentiment Analysis, NLI, and NER – demonstrate that EnerGIZAr achieves superior monolingual performance while also out-performing all methods for cross-lingual transfer when tested on XNLI. With EnerGIZAr, we propose an intuitive, explainable as well as state-of-the-art initialisation technique for continual pre-training of English models.

pdf bib

2024

pdf bib

Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations
Nikolaos Aletras | Orphee De Clercq
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

pdf bib abs

Enhancing Unrestricted Cross-Document Event Coreference with Graph Reconstruction Networks
Loic de Langhe | Orphee de Clercq | Veronique Hoste
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Event Coreference Resolution remains a challenging discourse-oriented task within the domain of Natural Language Processing. In this paper we propose a methodology where we combine traditional mention-pair coreference models with a lightweight and modular graph reconstruction algorithm. We show that building graph models on top of existing mention-pair models leads to improved performance for both a wide range of baseline mention-pair algorithms as well as a recently developed state-of-the-art model and this at virtually no added computational cost. Moreover, additional experiments seem to indicate that our method is highly robust in low-data settings and that its performance scales with increases in performance for the underlying mention-pair models.

pdf bib abs

Unsupervised Authorship Attribution for Medieval Latin Using Transformer-Based Embeddings
Loic De Langhe | Orphee De Clercq | Veronique Hoste
Proceedings of the Third Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA) @ LREC-COLING-2024

We explore the potential of employing transformer-based embeddings in an unsupervised authorship attribution task for medieval Latin. The development of Large Language Models (LLMs) and recent advances in transfer learning alleviate many of the traditional issues associated with authorship attribution in lower-resourced (ancient) languages. Despite this, these methods remain heavily understudied within this domain. Concretely, we generate strong contextual embeddings using a variety of mono -and multilingual transformer models and use these as input for two unsupervised clustering methods: a standard agglomerative clustering algorithm and a self-organizing map. We show that these transformer-based embeddings can be used to generate high-quality and interpretable clusterings, resulting in an attractive alternative to the traditional feature-based methods.

pdf bib

Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis
Orphée De Clercq | Valentin Barriere | Jeremy Barnes | Roman Klinger | João Sedoc | Shabnam Tafreshi
Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis

2023

pdf bib abs

Leveraging Structural Discourse Information for Event Coreference Resolution in Dutch
Loic De Langhe | Orphee De Clercq | Veronique Hoste
Proceedings of the 4th Workshop on Computational Approaches to Discourse (CODI 2023)

We directly embed easily extractable discourse structure information (subsection, paragraph and text type) in a transformer-based Dutch event coreference resolution model in order to more explicitly provide it with structural information that is known to be important in coreferential relationships. Results show that integrating this type of knowledge leads to a significant improvement in CONLL F1 for within-document settings (+ 8.6\%) and a minor improvement for cross-document settings (+ 1.1\%).

pdf bib

Filling in the Gaps: Efficient Event Coreference Resolution using Graph Autoencoder Networks
Loic De Langhe | Orphee De Clercq | Veronique Hoste
Proceedings of the Sixth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2023)

pdf bib abs

Misery Loves Complexity: Exploring Linguistic Complexity in the Context of Emotion Detection
Pranaydeep Singh | Luna De Bruyne | Orphée De Clercq | Els Lefever
Findings of the Association for Computational Linguistics: EMNLP 2023

Given the omnipresence of social media in our society, thoughts and opinions are being shared online in an unprecedented manner. This means that both positive and negative emotions can be equally and freely expressed. However, the negativity bias posits that human beings are inherently drawn to and more moved by negativity and, as a consequence, negative emotions get more traffic. Correspondingly, when writing about emotions this negativity bias could lead to expressions of negative emotions that are linguistically more complex. In this paper, we attempt to use readability and linguistic complexity metrics to better understand the manifestation of emotions on social media platforms like Reddit based on the widely-used GoEmotions dataset. We demonstrate that according to most metrics, negative emotions indeed tend to generate more complex text than positive emotions. In addition, we examine whether a higher complexity hampers the automatic identification of emotions. To answer this question, we fine-tuned three state-of-the-art transformers (BERT, RoBERTa, and SpanBERT) on the same emotion detection dataset. We demonstrate that these models often fail to predict emotions for the more complex texts. More advanced LLMs like RoBERTa and SpanBERT also fail to improve by significant margins on complex samples. This calls for a more nuanced interpretation of the emotion detection performance of transformer models. We make the automatically annotated data available for further research at: https://huggingface.co/datasets/pranaydeeps/CAMEO

pdf bib abs

What Does BERT actually Learn about Event Coreference? Probing Structural Information in a Fine-Tuned Dutch Language Model
Loic De Langhe | Orphee De Clercq | Veronique Hoste
Proceedings of the Fourth Workshop on Insights from Negative Results in NLP

We probe structural and discourse aspects of coreferential relationships in a fine-tuned Dutch BERT event coreference model. Previous research has suggested that no such knowledge is encoded in BERT-based models and the classification of coreferential relationships ultimately rests on outward lexical similarity. While we show that BERT can encode a (very) limited number of these discourse aspects (thus disproving assumptions in earlier research), we also note that knowledge of many structural features of coreferential relationships is absent from the encodings generated by the fine-tuned BERT model.

pdf bib

pdf bib abs

A Corpus-Based List of Frequently Used Words in Sesotho
Johannes Sibeko | Orphée De Clercq
Proceedings of the Fourth workshop on Resources for African Indigenous Languages (RAIL 2023)

This paper describes the SpeechReporting Corpus, an online collection of corpora annotated for a range of discourse phenomena. The corpora contain folktales from 7 lesser-studied West African languages. Apart from its value for theoretical linguistics, especially for the study of reported speech, the database is an important resource for the preservation of intangible cultural heritage of minority languages and the development and testing of cross-linguistically applicable computational tools.

pdf bib

Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis
Jeremy Barnes | Orphée De Clercq | Roman Klinger
Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis

2022

pdf bib abs

A Hybrid Knowledge and Transformer-Based Model for Event Detection with Automatic Self-Attention Threshold, Layer and Head Selection
Thierry Desot | Orphee De Clercq | Veronique Hoste
Proceedings of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE)

Event and argument role detection are frequently conceived as separate tasks. In this work we conceive both processes as one taskin a hybrid event detection approach. Its main component is based on automatic keyword extraction (AKE) using the self-attention mechanism of a BERT transformer model. As a bottleneck for AKE is defining the threshold of the attention values, we propose a novel method for automatic self-attention thresholdselection. It is fueled by core event information, or simply the verb and its arguments as the backbone of an event. These are outputted by a knowledge-based syntactic parser. In a secondstep the event core is enriched with other semantically salient words provided by the transformer model. Furthermore, we propose an automatic self-attention layer and head selectionmechanism, by analyzing which self-attention cells in the BERT transformer contribute most to the hybrid event detection and which linguistic tasks they represent. This approach was integrated in a pipeline event extraction approachand outperforms three state of the art multi-task event extraction methods.

pdf bib abs

Investigating Cross-Document Event Coreference for Dutch
Loic De Langhe | Orphee De Clercq | Veronique Hoste
Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference

In this paper we present baseline results for Event Coreference Resolution (ECR) in Dutch using gold-standard (i.e non-predicted) event mentions. A newly developed benchmark dataset allows us to properly investigate the possibility of creating ECR systems for both within and cross-document coreference. We give an overview of the state of the art for ECR in other languages, as well as a detailed overview of existing ECR resources. Afterwards, we provide a comparative report on our own dataset. We apply a significant number of approaches that have been shown to attain good results for English ECR including feature-based models, monolingual transformer language models and multilingual language models. The best results were obtained using the monolingual BERTje model. Finally, results for all models are thoroughly analysed and visualised, as to provide insight into the inner workings of ECR and long-distance semantic NLP tasks in general.

pdf bib abs

Aspect-Based Emotion Analysis and Multimodal Coreference: A Case Study of Customer Comments on Adidas Instagram Posts
Luna De Bruyne | Akbar Karimi | Orphee De Clercq | Andrea Prati | Veronique Hoste
Proceedings of the Thirteenth Language Resources and Evaluation Conference

While aspect-based sentiment analysis of user-generated content has received a lot of attention in the past years, emotion detection at the aspect level has been relatively unexplored. Moreover, given the rise of more visual content on social media platforms, we want to meet the ever-growing share of multimodal content. In this paper, we present a multimodal dataset for Aspect-Based Emotion Analysis (ABEA). Additionally, we take the first steps in investigating the utility of multimodal coreference resolution in an ABEA framework. The presented dataset consists of 4,900 comments on 175 images and is annotated with aspect and emotion categories and the emotional dimensions of valence and arousal. Our preliminary experiments suggest that ABEA does not benefit from multimodal coreference resolution, and that aspect and emotion classification only requires textual information. However, when more specific information about the aspects is desired, image recognition could be essential.

pdf bib abs

How Language-Dependent is Emotion Detection? Evidence from Multilingual BERT
Luna De Bruyne | Pranaydeep Singh | Orphee De Clercq | Els Lefever | Veronique Hoste
Proceedings of the 2nd Workshop on Multi-lingual Representation Learning (MRL)

As emotion analysis in text has gained a lot of attention in the field of natural language processing, differences in emotion expression across languages could have consequences for how emotion detection models work. We evaluate the language-dependence of an mBERT-based emotion detection model by comparing language identification performance before and after fine-tuning on emotion detection, and performing (adjusted) zero-shot experiments to assess whether emotion detection models rely on language-specific information. When dealing with typologically dissimilar languages, we found evidence for the language-dependence of emotion detection.

pdf bib abs

Investigating the Quality of Static Anchor Embeddings from Transformers for Under-Resourced Languages
Pranaydeep Singh | Orphee De Clercq | Els Lefever
Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages

This paper reports on experiments for cross-lingual transfer using the anchor-based approach of Schuster et al. (2019) for English and a low-resourced language, namely Hindi. For the sake of comparison, we also evaluate the approach on three very different higher-resourced languages, viz. Dutch, Russian and Chinese. Initially designed for ELMo embeddings, we analyze the approach for the more recent BERT family of transformers for a variety of tasks, both mono and cross-lingual. The results largely prove that like most other cross-lingual transfer approaches, the static anchor approach is underwhelming for the low-resource language, while performing adequately for the higher resourced ones. We attempt to provide insights into both the quality of the anchors, and the performance for low-shot cross-lingual transfer to better understand this performance gap. We make the extracted anchors and the modified train and test sets available for future research at https://github.com/pranaydeeps/Vyaapak

pdf bib

Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis
Jeremy Barnes | Orphée De Clercq | Valentin Barriere | Shabnam Tafreshi | Sawsan Alqahtani | João Sedoc | Roman Klinger | Alexandra Balahur
Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis

pdf bib abs

SentEMO: A Multilingual Adaptive Platform for Aspect-based Sentiment and Emotion Analysis
Ellen De Geyndt | Orphee De Clercq | Cynthia Van Hee | Els Lefever | Pranaydeep Singh | Olivier Parent | Veronique Hoste
Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis

In this paper, we present the SentEMO platform, a tool that provides aspect-based sentiment analysis and emotion detection of unstructured text data such as reviews, emails and customer care conversations. Currently, models have been trained for five domains and one general domain and are implemented in a pipeline approach, where the output of one model serves as the input for the next. The results are presented in three dashboards, allowing companies to gain more insights into what stakeholders think of their products and services. The SentEMO platform is available at https://sentemo.ugent.be

2021

pdf bib abs

Event Prominence Extraction Combining a Knowledge-Based Syntactic Parser and a BERT Classifier for Dutch
Thierry Desot | Orphee De Clercq | Veronique Hoste
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

A core task in information extraction is event detection that identifies event triggers in sentences that are typically classified into event types. In this study an event is considered as the unit to measure diversity and similarity in news articles in the framework of a news recommendation system. Current typology-based event detection approaches fail to handle the variety of events expressed in real-world situations. To overcome this, we aim to perform event salience classification and explore whether a transformer model is capable of classifying new information into less and more general prominence classes. After comparing a Support Vector Machine (SVM) baseline and our transformer-based classifier performances on several event span formats, we conceived multi-word event spans as syntactic clauses. Those are fed into our prominence classifier which is fine-tuned on pre-trained Dutch BERT word embeddings. On top of that we outperform a pipeline of a Conditional Random Field (CRF) approach to event-trigger word detection and the BERT-based classifier. To the best of our knowledge we present the first event extraction approach that combines an expert-based syntactic parser with a transformer-based classifier for Dutch.

pdf bib

Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis
Orphee De Clercq | Alexandra Balahur | Joao Sedoc | Valentin Barriere | Shabnam Tafreshi | Sven Buechel | Veronique Hoste
Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

pdf bib abs

WASSA 2021 Shared Task: Predicting Empathy and Emotion in Reaction to News Stories
Shabnam Tafreshi | Orphee De Clercq | Valentin Barriere | Sven Buechel | João Sedoc | Alexandra Balahur
Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

This paper presents the results that were obtained from the WASSA 2021 shared task on predicting empathy and emotions. The participants were given access to a dataset comprising empathic reactions to news stories where harm is done to a person, group, or other. These reactions consist of essays, Batson empathic concern, and personal distress scores, and the dataset was further extended with news articles, person-level demographic information (age, gender, ethnicity, income, education level), and personality information. Additionally, emotion labels, namely Ekman’s six basic emotions, were added to the essays at both the document and sentence level. Participation was encouraged in two tracks: predicting empathy and predicting emotion categories. In total five teams participated in the shared task. We summarize the methods and resources used by the participating teams.

pdf bib abs

Exploring Implicit Sentiment Evoked by Fine-grained News Events
Cynthia Van Hee | Orphee De Clercq | Veronique Hoste
Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

We investigate the feasibility of defining sentiment evoked by fine-grained news events. Our research question is based on the premise that methods for detecting implicit sentiment in news can be a key driver of content diversity, which is one way to mitigate the detrimental effects of filter bubbles that recommenders based on collaborative filtering may produce. Our experiments are based on 1,735 news articles from major Flemish newspapers that were manually annotated, with high agreement, for implicit sentiment. While lexical resources prove insufficient for sentiment analysis in this data genre, our results demonstrate that machine learning models based on SVM and BERT are able to automatically infer the implicit sentiment evoked by news events.

pdf bib abs

Emotional RobBERT and Insensitive BERTje: Combining Transformers and Affect Lexica for Dutch Emotion Detection
Luna De Bruyne | Orphee De Clercq | Veronique Hoste
Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

In a first step towards improving Dutch emotion detection, we try to combine the Dutch transformer models BERTje and RobBERT with lexicon-based methods. We propose two architectures: one in which lexicon information is directly injected into the transformer model and a meta-learning approach where predictions from transformers are combined with lexicon features. The models are tested on 1,000 Dutch tweets and 1,000 captions from TV-shows which have been manually annotated with emotion categories and dimensions. We find that RobBERT clearly outperforms BERTje, but that directly adding lexicon information to transformers does not improve performance. In the meta-learning approach, lexicon information does have a positive effect on BERTje, but not on RobBERT. This suggests that more emotional information is already contained within this latter language model.

2020

pdf bib abs

It’s absolutely divine! Can fine-grained sentiment analysis benefit from coreference resolution?
Orphee De Clercq | Veronique Hoste
Proceedings of the Third Workshop on Computational Models of Reference, Anaphora and Coreference

While it has been claimed that anaphora or coreference resolution plays an important role in opinion mining, it is not clear to what extent coreference resolution actually boosts performance, if at all. In this paper, we investigate the potential added value of coreference resolution for the aspect-based sentiment analysis of restaurant reviews in two languages, English and Dutch. We focus on the task of aspect category classification and investigate whether including coreference information prior to classification to resolve implicit aspect mentions is beneficial. Because coreference resolution is not a solved task in NLP, we rely on both automatically-derived and gold-standard coreference relations, allowing us to investigate the true upper bound. By training a classifier on a combination of lexical and semantic features, we show that resolving the coreferential relations prior to classification is beneficial in a joint optimization setup. However, this is only the case when relying on gold-standard relations and the result is more outspoken for English than for Dutch. When validating the optimal models, however, we found that only the Dutch pipeline is able to achieve a satisfying performance on a held-out test set and does so regardless of whether coreference information was included.

pdf bib abs

An Exploratory Study into Automated Précis Grading
Orphee De Clercq | Senne Van Hoecke
Proceedings of the Twelfth Language Resources and Evaluation Conference

Automated writing evaluation is a popular research field, but the main focus has been on evaluating argumentative essays. In this paper, we consider a different genre, namely précis texts. A précis is a written text that provides a coherent summary of main points of a spoken or written text. We present a corpus of English précis texts which all received a grade assigned by a highly-experienced English language teacher and were subsequently annotated following an exhaustive error typology. With this corpus we trained a machine learning model which relies on a number of linguistic, automatic summarization and AWE features. Our results reveal that this model is able to predict the grade of précis texts with only a moderate error margin.

pdf bib abs

An Emotional Mess! Deciding on a Framework for Building a Dutch Emotion-Annotated Corpus
Luna De Bruyne | Orphee De Clercq | Veronique Hoste
Proceedings of the Twelfth Language Resources and Evaluation Conference

Seeing the myriad of existing emotion models, with the categorical versus dimensional opposition the most important dividing line, building an emotion-annotated corpus requires some well thought-out strategies concerning framework choice. In our work on automatic emotion detection in Dutch texts, we investigate this problem by means of two case studies. We find that the labels joy, love, anger, sadness and fear are well-suited to annotate texts coming from various domains and topics, but that the connotation of the labels strongly depends on the origin of the texts. Moreover, it seems that information is lost when an emotional state is forcedly classified in a limited set of categories, indicating that a bi-representational format is desirable when creating an emotion corpus.

2019

pdf bib abs

Benefits of Data Augmentation for NMT-based Text Normalization of User-Generated Content
Claudia Matos Veliz | Orphee De Clercq | Veronique Hoste
Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)

One of the most persistent characteristics of written user-generated content (UGC) is the use of non-standard words. This characteristic contributes to an increased difficulty to automatically process and analyze UGC. Text normalization is the task of transforming lexical variants to their canonical forms and is often used as a pre-processing step for conventional NLP tasks in order to overcome the performance drop that NLP systems experience when applied to UGC. In this work, we follow a Neural Machine Translation approach to text normalization. To train such an encoder-decoder model, large parallel training corpora of sentence pairs are required. However, obtaining large data sets with UGC and their normalized version is not trivial, especially for languages other than English. In this paper, we explore how to overcome this data bottleneck for Dutch, a low-resource language. We start off with a small publicly available parallel Dutch data set comprising three UGC genres and compare two different approaches. The first is to manually normalize and add training data, a money and time-consuming task. The second approach is a set of data augmentation techniques which increase data size by converting existing resources into synthesized non-standard forms. Our results reveal that, while the different approaches yield similar results regarding the normalization issues in the test set, they also introduce a large amount of over-normalizations.

pdf bib abs

Leveraging syntactic parsing to improve event annotation matching
Camiel Colruyt | Orphée De Clercq | Véronique Hoste
Proceedings of the First Workshop on Aggregating and Analysing Crowdsourced Annotations for NLP

Detecting event mentions is the first step in event extraction from text and annotating them is a notoriously difficult task. Evaluating annotator consistency is crucial when building datasets for mention detection. When event mentions are allowed to cover many tokens, annotators may disagree on their span, which means that overlapping annotations may then refer to the same event or to different events. This paper explores different fuzzy-matching functions which aim to resolve this ambiguity. The functions extract the sets of syntactic heads present in the annotations, use the Dice coefficient to measure the similarity between sets and return a judgment based on a given threshold. The functions are tested against the judgment of a human evaluator and a comparison is made between sets of tokens and sets of syntactic heads. The best-performing function is a head-based function that is found to agree with the human evaluator in 89% of cases.

pdf bib abs

Comparing MT Approaches for Text Normalization
Claudia Matos Veliz | Orphee De Clercq | Veronique Hoste
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

One of the main characteristics of social media data is the use of non-standard language. Since NLP tools have been trained on traditional text material their performance drops when applied to social media data. One way to overcome this is to first perform text normalization. In this work, we apply text normalization to noisy English and Dutch text coming from different social media genres: text messages, message board posts and tweets. We consider the normalization task as a Machine Translation problem and test the two leading paradigms: statistical and neural machine translation. For SMT we explore the added value of varying background corpora for training the language model. For NMT we have a look at data augmentation since the parallel datasets we are working with are limited in size. Our results reveal that when relying on SMT to perform the normalization it is beneficial to use a background corpus that is close to the genre you are normalizing. Regarding NMT, we find that the translations - or normalizations - coming out of this model are far from perfect and that for a low-resource language like Dutch adding additional training data works better than artificially augmenting the data.

pdf bib

Proceedings of the Tenth Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis
Alexandra Balahur | Roman Klinger | Veronique Hoste | Carlo Strapparava | Orphee De Clercq
Proceedings of the Tenth Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

pdf bib

Modelling word translation entropy and syntactic equivalence with machine learning
Bram Vanroy | Orphée De Clercq | Lieve Macken
Proceedings of the Second MEMENTO workshop on Modelling Parameters of Cognitive Effort in Translation Production

2018

pdf bib abs

LT3 at SemEval-2018 Task 1: A classifier chain to detect emotions in tweets
Luna De Bruyne | Orphée De Clercq | Véronique Hoste
Proceedings of the 12th International Workshop on Semantic Evaluation

This paper presents an emotion classification system for English tweets, submitted for the SemEval shared task on Affect in Tweets, subtask 5: Detecting Emotions. The system combines lexicon, n-gram, style, syntactic and semantic features. For this multi-class multi-label problem, we created a classifier chain. This is an ensemble of eleven binary classifiers, one for each possible emotion category, where each model gets the predictions of the preceding models as additional features. The predicted labels are combined to get a multi-label representation of the predictions. Our system was ranked eleventh among thirty five participating teams, with a Jaccard accuracy of 52.0% and macro- and micro-average F1-scores of 49.3% and 64.0%, respectively.

pdf bib abs

IEST: WASSA-2018 Implicit Emotions Shared Task
Roman Klinger | Orphée De Clercq | Saif Mohammad | Alexandra Balahur
Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

Past shared tasks on emotions use data with both overt expressions of emotions (I am so happy to see you!) as well as subtle expressions where the emotions have to be inferred, for instance from event descriptions. Further, most datasets do not focus on the cause or the stimulus of the emotion. Here, for the first time, we propose a shared task where systems have to predict the emotions in a large automatically labeled dataset of tweets without access to words denoting emotions. Based on this intention, we call this the Implicit Emotion Shared Task (IEST) because the systems have to infer the emotion mostly from the context. Every tweet has an occurrence of an explicit emotion word that is masked. The tweets are collected in a manner such that they are likely to include a description of the cause of the emotion – the stimulus. Altogether, 30 teams submitted results which range from macro F1 scores of 21 % to 71 %. The baseline (Max-Ent bag of words and bigrams) obtains an F1 score of 60 % which was available to the participants during the development phase. A study with human annotators suggests that automatic methods outperform human predictions, possibly by honing into subtle textual clues not used by humans. Corpora, resources, and results are available at the shared task website at http://implicitemotions.wassa2018.com.

2017

pdf bib

Noise or music? Investigating the usefulness of normalisation for robust sentiment analysis on social media data
Cynthia Van Hee | Marjan Van de Kauter | Orphée De Clercq | Els Lefever | Bart Desmet | Véronique Hoste
Traitement Automatique des Langues, Volume 58, Numéro 1 : Varia [Varia]

pdf bib abs

Towards an integrated pipeline for aspect-based sentiment analysis in various domains
Orphée De Clercq | Els Lefever | Gilles Jacobs | Tijl Carpels | Véronique Hoste
Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

This paper presents an integrated ABSA pipeline for Dutch that has been developed and tested on qualitative user feedback coming from three domains: retail, banking and human resources. The two latter domains provide service-oriented data, which has not been investigated before in ABSA. By performing in-domain and cross-domain experiments the validity of our approach was investigated. We show promising results for the three ABSA subtasks, aspect term extraction, aspect category classification and aspect polarity classification.

2016

pdf bib

All Mixed Up? Finding the Optimal Feature Set for General Readability Prediction and Its Application to English and Dutch
Orphée De Clercq | Véronique Hoste
Computational Linguistics, Volume 42, Issue 3 - September 2016

pdf bib abs

Rude waiter but mouthwatering pastries! An exploratory study into Dutch Aspect-Based Sentiment Analysis
Orphée De Clercq | Véronique Hoste
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The fine-grained task of automatically detecting all sentiment expressions within a given document and the aspects to which they refer is known as aspect-based sentiment analysis. In this paper we present the first full aspect-based sentiment analysis pipeline for Dutch and apply it to customer reviews. To this purpose, we collected reviews from two different domains, i.e. restaurant and smartphone reviews. Both corpora have been manually annotated using newly developed guidelines that comply to standard practices in the field. For our experimental pipeline we perceive aspect-based sentiment analysis as a task consisting of three main subtasks which have to be tackled incrementally: aspect term extraction, aspect category classification and polarity classification. First experiments on our Dutch restaurant corpus reveal that this is indeed a feasible approach that yields promising results.

2015

pdf bib

LT3: Applying Hybrid Terminology Extraction to Aspect-Based Sentiment Analysis
Orphée De Clercq | Marjan Van de Kauter | Els Lefever | Véronique Hoste
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

pdf bib abs

Towards Shared Datasets for Normalization Research
Orphée De Clercq | Sarah Schulz | Bart Desmet | Véronique Hoste
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper we present a Dutch and English dataset that can serve as a gold standard for evaluating text normalization approaches. With the combination of text messages, message board posts and tweets, these datasets represent a variety of user generated content. All data was manually normalized to their standard form using newly-developed guidelines. We perform automatic lexical normalization experiments on these datasets using statistical machine translation techniques. We focus on both the word and character level and find that we can improve the BLEU score with ca. 20% for both languages. In order for this user generated content data to be released publicly to the research community some issues first need to be resolved. These are discussed in closer detail by focussing on the current legislation and by investigating previous similar data collection projects. With this discussion we hope to shed some light on various difficulties researchers are facing when trying to share social media data.

pdf bib

LT3: Sentiment Classification in User-Generated Content Using a Rich Feature Set
Cynthia Van Hee | Marjan Van de Kauter | Orphée De Clercq | Els Lefever | Véronique Hoste
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

2013

pdf bib

Normalization of Dutch User-Generated Content
Orphée De Clercq | Sarah Schulz | Bart Desmet | Els Lefever | Véronique Hoste
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013

2012

pdf bib abs

Collection of a corpus of Dutch SMS
Maaske Treurniet | Orphée De Clercq | Henk van den Heuvel | Nelleke Oostdijk
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

In this paper we present the first freely available corpus of Dutch text messages containing data originating from the Netherlands and Flanders. This corpus has been collected in the framework of the SoNaR project and constitutes a viable part of this 500-million-word corpus. About 53,000 text messages were collected on a large scale, based on voluntary donations. These messages will be distributed as such. In this paper we focus on the data collection processes involved and after studying the effect of media coverage we show that especially free publicity in newspapers and on social media networks results in more contributions. All SMS are provided with metadata information. Looking at the composition of the corpus, it becomes visible that a small number of people have contributed a large amount of data, in total 272 people have contributed to the corpus during three months. The number of women contributing to the corpus is larger than the number of men, but male contributors submitted larger amounts of data. This corpus will be of paramount importance for sociolinguistic research and normalisation studies.

pdf bib abs

Evaluating automatic cross-domain Dutch semantic role annotation
Orphée De Clercq | Veronique Hoste | Paola Monachesi
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

In this paper we present the first corpus where one million Dutch words from a variety of text genres have been annotated with semantic roles. 500K have been completely manually verified and used as training material to automatically label another 500K. All data has been annotated following an adapted version of the PropBank guidelines. The corpus's rich text type diversity and the availability of manually verified syntactic dependency structures allowed us to experiment with an existing semantic role labeler for Dutch. In order to test the system's portability across various domains, we experimented with training on individual domains and compared this with training on multiple domains by adding more data. Our results show that training on large data sets is necessary but that including genre-specific training material is also crucial to optimize classification. We observed that a small amount of in-domain training data is already sufficient to improve our semantic role labeler.

2011

pdf bib

Cross-Domain Dutch Coreference Resolution
Orphée De Clercq | Véronique Hoste | Iris Hendrickx
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

2010

pdf bib abs

Data Collection and IPR in Multilingual Parallel Corpora. Dutch Parallel Corpus
Orphée De Clercq | Maribel Montero Perez
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

After three years of work the Dutch Parallel Corpus (DPC) project has reached an end. The finalized corpus is a ten-million-word high-quality sentence-aligned bidirectional parallel corpus of Dutch, English and French, with Dutch as central language. In this paper we present the corpus and try to formulate some basic data collection principles, based on the work that was carried out for the project. Building a corpus is a difficult and time-consuming task, especially when every text sample included has to be cleared from copyrights. The DPC is balanced according to five text types (literature, journalistic texts, instructive texts, administrative texts and texts treating external communication) and four translation directions (Dutch-English, English-Dutch, Dutch-French and French-Dutch). All the text material was cleared from copyrights. The data collection process necessitated the involvement of different text providers, which resulted in drawing up four different licence agreements. Problems such as an unknown source language, copyright issues and changes to the corpus design are discussed in close detail and illustrated with examples so as to be of help to future corpus compilers.

pdf bib abs

Balancing SoNaR: IPR versus Processing Issues in a 500-Million-Word Written Dutch Reference Corpus
Martin Reynaert | Nelleke Oostdijk | Orphée De Clercq | Henk van den Heuvel | Franciska de Jong
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In The Low Countries, a major reference corpus for written Dutch is being built. We discuss the interplay between data acquisition and data processing during the creation of the SoNaR Corpus. Based on developments in traditional corpus compiling and new web harvesting approaches, SoNaR is designed to contain 500 million words, balanced over 36 text types including both traditional and new media texts. Beside its balanced design, every text sample included in SoNaR will have its IPR issues settled to the largest extent possible. This data collection task presents many challenges because every decision taken on the level of text acquisition has ramifications for the level of processing and the general usability of the corpus. As far as the traditional text types are concerned, each text brings its own processing requirements and issues. For new media texts - SMS, chat - the problem is even more complex, issues such as anonimity, recognizability and citation right, all present problems that have to be tackled. The solutions actually lead to the creation of two corpora: a gigaword SoNaR, IPR-cleared for research purposes, and the smaller - of commissioned size - more privacy compliant SoNaR, IPR-cleared for commercial purposes as well.