Veronique Hoste

Also published as: Véronique Hoste


2021

pdf bib
Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis
Orphee De Clercq | Alexandra Balahur | Joao Sedoc | Valentin Barriere | Shabnam Tafreshi | Sven Buechel | Veronique Hoste
Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

pdf bib
Exploring Implicit Sentiment Evoked by Fine-grained News Events
Cynthia Van Hee | Orphee De Clercq | Veronique Hoste
Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

We investigate the feasibility of defining sentiment evoked by fine-grained news events. Our research question is based on the premise that methods for detecting implicit sentiment in news can be a key driver of content diversity, which is one way to mitigate the detrimental effects of filter bubbles that recommenders based on collaborative filtering may produce. Our experiments are based on 1,735 news articles from major Flemish newspapers that were manually annotated, with high agreement, for implicit sentiment. While lexical resources prove insufficient for sentiment analysis in this data genre, our results demonstrate that machine learning models based on SVM and BERT are able to automatically infer the implicit sentiment evoked by news events.

pdf bib
Nearest neighbour approaches for Emotion Detection in Tweets
Olha Kaminska | Chris Cornelis | Veronique Hoste
Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

Emotion detection is an important task that can be applied to social media data to discover new knowledge. While the use of deep learning methods for this task has been prevalent, they are black-box models, making their decisions hard to interpret for a human operator. Therefore, in this paper, we propose an approach using weighted k Nearest Neighbours (kNN), a simple, easy to implement, and explainable machine learning model. These qualities can help to enhance results’ reliability and guide error analysis. In particular, we apply the weighted kNN model to the shared emotion detection task in tweets from SemEval-2018. Tweets are represented using different text embedding methods and emotion lexicon vocabulary scores, and classification is done by an ensemble of weighted kNN models. Our best approaches obtain results competitive with state-of-the-art solutions and open up a promising alternative path to neural network methods.

pdf bib
Emotional RobBERT and Insensitive BERTje: Combining Transformers and Affect Lexica for Dutch Emotion Detection
Luna De Bruyne | Orphee De Clercq | Veronique Hoste
Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

In a first step towards improving Dutch emotion detection, we try to combine the Dutch transformer models BERTje and RobBERT with lexicon-based methods. We propose two architectures: one in which lexicon information is directly injected into the transformer model and a meta-learning approach where predictions from transformers are combined with lexicon features. The models are tested on 1,000 Dutch tweets and 1,000 captions from TV-shows which have been manually annotated with emotion categories and dimensions. We find that RobBERT clearly outperforms BERTje, but that directly adding lexicon information to transformers does not improve performance. In the meta-learning approach, lexicon information does have a positive effect on BERTje, but not on RobBERT. This suggests that more emotional information is already contained within this latter language model.

pdf bib
A Million Tweets Are Worth a Few Points: Tuning Transformers for Customer Service Tasks
Amir Hadifar | Sofie Labat | Veronique Hoste | Chris Develder | Thomas Demeester
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

In online domain-specific customer service applications, many companies struggle to deploy advanced NLP models successfully, due to the limited availability of and noise in their datasets. While prior research demonstrated the potential of migrating large open-domain pretrained models for domain-specific tasks, the appropriate (pre)training strategies have not yet been rigorously evaluated in such social media customer service settings, especially under multilingual conditions. We address this gap by collecting a multilingual social media corpus containing customer service conversations (865k tweets), comparing various pipelines of pretraining and finetuning approaches, applying them on 5 different end tasks. We show that pretraining a generic multilingual transformer model on our in-domain dataset, before finetuning on specific end tasks, consistently boosts performance, especially in non-English settings.

2020

pdf bib
It’s absolutely divine! Can fine-grained sentiment analysis benefit from coreference resolution?
Orphee De Clercq | Veronique Hoste
Proceedings of the Third Workshop on Computational Models of Reference, Anaphora and Coreference

While it has been claimed that anaphora or coreference resolution plays an important role in opinion mining, it is not clear to what extent coreference resolution actually boosts performance, if at all. In this paper, we investigate the potential added value of coreference resolution for the aspect-based sentiment analysis of restaurant reviews in two languages, English and Dutch. We focus on the task of aspect category classification and investigate whether including coreference information prior to classification to resolve implicit aspect mentions is beneficial. Because coreference resolution is not a solved task in NLP, we rely on both automatically-derived and gold-standard coreference relations, allowing us to investigate the true upper bound. By training a classifier on a combination of lexical and semantic features, we show that resolving the coreferential relations prior to classification is beneficial in a joint optimization setup. However, this is only the case when relying on gold-standard relations and the result is more outspoken for English than for Dutch. When validating the optimal models, however, we found that only the Dutch pipeline is able to achieve a satisfying performance on a held-out test set and does so regardless of whether coreference information was included.

pdf bib
Extracting Fine-Grained Economic Events from Business News
Gilles Jacobs | Veronique Hoste
Proceedings of the 1st Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation

Based on a recently developed fine-grained event extraction dataset for the economic domain, we present in a pilot study for supervised economic event extraction. We investigate how a state-of-the-art model for event extraction performs on the trigger and argument identification and classification. While F1-scores of above 50% are obtained on the task of trigger identification, we observe a large gap in performance compared to results on the benchmark ACE05 dataset. We show that single-token triggers do not provide sufficient discriminative information for a fine-grained event detection setup in a closed domain such as economics, since many classes have a large degree of lexico-semantic and contextual overlap.

pdf bib
An Emotional Mess! Deciding on a Framework for Building a Dutch Emotion-Annotated Corpus
Luna De Bruyne | Orphee De Clercq | Veronique Hoste
Proceedings of the 12th Language Resources and Evaluation Conference

Seeing the myriad of existing emotion models, with the categorical versus dimensional opposition the most important dividing line, building an emotion-annotated corpus requires some well thought-out strategies concerning framework choice. In our work on automatic emotion detection in Dutch texts, we investigate this problem by means of two case studies. We find that the labels joy, love, anger, sadness and fear are well-suited to annotate texts coming from various domains and topics, but that the connotation of the labels strongly depends on the origin of the texts. Moreover, it seems that information is lost when an emotional state is forcedly classified in a limited set of categories, indicating that a bi-representational format is desirable when creating an emotion corpus.

pdf bib
LT3 at SemEval-2020 Task 7: Comparing Feature-Based and Transformer-Based Approaches to Detect Funny Headlines
Bram Vanroy | Sofie Labat | Olha Kaminska | Els Lefever | Veronique Hoste
Proceedings of the Fourteenth Workshop on Semantic Evaluation

This paper presents two different systems for the SemEval shared task 7 on Assessing Humor in Edited News Headlines, sub-task 1, where the aim was to estimate the intensity of humor generated in edited headlines. Our first system is a feature-based machine learning system that combines different types of information (e.g. word embeddings, string similarity, part-of-speech tags, perplexity scores, named entity recognition) in a Nu Support Vector Regressor (NuSVR). The second system is a deep learning-based approach that uses the pre-trained language model RoBERTa to learn latent features in the news headlines that are useful to predict the funniness of each headline. The latter system was also our final submission to the competition and is ranked seventh among the 49 participating teams, with a root-mean-square error (RMSE) of 0.5253.

pdf bib
TermEval 2020: Shared Task on Automatic Term Extraction Using the Annotated Corpora for Term Extraction Research (ACTER) Dataset
Ayla Rigouts Terryn | Veronique Hoste | Patrick Drouin | Els Lefever
Proceedings of the 6th International Workshop on Computational Terminology

The TermEval 2020 shared task provided a platform for researchers to work on automatic term extraction (ATE) with the same dataset: the Annotated Corpora for Term Extraction Research (ACTER). The dataset covers three languages (English, French, and Dutch) and four domains, of which the domain of heart failure was kept as a held-out test set on which final f1-scores were calculated. The aim was to provide a large, transparent, qualitatively annotated, and diverse dataset to the ATE research community, with the goal of promoting comparative research and thus identifying strengths and weaknesses of various state-of-the-art methodologies. The results show a lot of variation between different systems and illustrate how some methodologies reach higher precision or recall, how different systems extract different types of terms, how some are exceptionally good at finding rare terms, or are less impacted by term length. The current contribution offers an overview of the shared task with a comparative evaluation, which complements the individual papers by all participants.

2019

pdf bib
Proceedings of the Second Workshop on Economics and Natural Language Processing
Udo Hahn | Véronique Hoste | Zhu Zhang
Proceedings of the Second Workshop on Economics and Natural Language Processing

pdf bib
Benefits of Data Augmentation for NMT-based Text Normalization of User-Generated Content
Claudia Matos Veliz | Orphee De Clercq | Veronique Hoste
Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)

One of the most persistent characteristics of written user-generated content (UGC) is the use of non-standard words. This characteristic contributes to an increased difficulty to automatically process and analyze UGC. Text normalization is the task of transforming lexical variants to their canonical forms and is often used as a pre-processing step for conventional NLP tasks in order to overcome the performance drop that NLP systems experience when applied to UGC. In this work, we follow a Neural Machine Translation approach to text normalization. To train such an encoder-decoder model, large parallel training corpora of sentence pairs are required. However, obtaining large data sets with UGC and their normalized version is not trivial, especially for languages other than English. In this paper, we explore how to overcome this data bottleneck for Dutch, a low-resource language. We start off with a small publicly available parallel Dutch data set comprising three UGC genres and compare two different approaches. The first is to manually normalize and add training data, a money and time-consuming task. The second approach is a set of data augmentation techniques which increase data size by converting existing resources into synthesized non-standard forms. Our results reveal that, while the different approaches yield similar results regarding the normalization issues in the test set, they also introduce a large amount of over-normalizations.

pdf bib
Leveraging syntactic parsing to improve event annotation matching
Camiel Colruyt | Orphée De Clercq | Véronique Hoste
Proceedings of the First Workshop on Aggregating and Analysing Crowdsourced Annotations for NLP

Detecting event mentions is the first step in event extraction from text and annotating them is a notoriously difficult task. Evaluating annotator consistency is crucial when building datasets for mention detection. When event mentions are allowed to cover many tokens, annotators may disagree on their span, which means that overlapping annotations may then refer to the same event or to different events. This paper explores different fuzzy-matching functions which aim to resolve this ambiguity. The functions extract the sets of syntactic heads present in the annotations, use the Dice coefficient to measure the similarity between sets and return a judgment based on a given threshold. The functions are tested against the judgment of a human evaluator and a comparison is made between sets of tokens and sets of syntactic heads. The best-performing function is a head-based function that is found to agree with the human evaluator in 89% of cases.

pdf bib
Proceedings of the Tenth Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis
Alexandra Balahur | Roman Klinger | Veronique Hoste | Carlo Strapparava | Orphee De Clercq
Proceedings of the Tenth Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

pdf bib
LT3 at SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter (hatEval)
Nina Bauwelinck | Gilles Jacobs | Véronique Hoste | Els Lefever
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper describes our contribution to the SemEval-2019 Task 5 on the detection of hate speech against immigrants and women in Twitter (hatEval). We considered a supervised classification-based approach to detect hate speech in English tweets, which combines a variety of standard lexical and syntactic features with specific features for capturing offensive language. Our experimental results show good classification performance on the training data, but a considerable drop in recall on the held-out test set.

pdf bib
Comparing MT Approaches for Text Normalization
Claudia Matos Veliz | Orphee De Clercq | Veronique Hoste
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

One of the main characteristics of social media data is the use of non-standard language. Since NLP tools have been trained on traditional text material their performance drops when applied to social media data. One way to overcome this is to first perform text normalization. In this work, we apply text normalization to noisy English and Dutch text coming from different social media genres: text messages, message board posts and tweets. We consider the normalization task as a Machine Translation problem and test the two leading paradigms: statistical and neural machine translation. For SMT we explore the added value of varying background corpora for training the language model. For NMT we have a look at data augmentation since the parallel datasets we are working with are limited in size. Our results reveal that when relying on SMT to perform the normalization it is beneficial to use a background corpus that is close to the genre you are normalizing. Regarding NMT, we find that the translations - or normalizations - coming out of this model are far from perfect and that for a low-resource language like Dutch adding additional training data works better than artificially augmenting the data.

pdf bib
Analysing the Impact of Supervised Machine Learning on Automatic Term Extraction: HAMLET vs TermoStat
Ayla Rigouts Terryn | Patrick Drouin | Veronique Hoste | Els Lefever
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

Traditional approaches to automatic term extraction do not rely on machine learning (ML) and select the top n ranked candidate terms or candidate terms above a certain predefined cut-off point, based on a limited number of linguistic and statistical clues. However, supervised ML approaches are gaining interest. Relatively little is known about the impact of these supervised methodologies; evaluations are often limited to precision, and sometimes recall and f1-scores, without information about the nature of the extracted candidate terms. Therefore, the current paper presents a detailed and elaborate analysis and comparison of a traditional, state-of-the-art system (TermoStat) and a new, supervised ML approach (HAMLET), using the results obtained for the same, manually annotated, Dutch corpus about dressage.

2018

pdf bib
We Usually Don’t Like Going to the Dentist: Using Common Sense to Detect Irony on Twitter
Cynthia Van Hee | Els Lefever | Véronique Hoste
Computational Linguistics, Volume 44, Issue 4 - December 2018

Although common sense and connotative knowledge come naturally to most people, computers still struggle to perform well on tasks for which such extratextual information is required. Automatic approaches to sentiment analysis and irony detection have revealed that the lack of such world knowledge undermines classification performance. In this article, we therefore address the challenge of modeling implicit or prototypical sentiment in the framework of automatic irony detection. Starting from manually annotated connoted situation phrases (e.g., “flight delays,” “sitting the whole day at the doctor’s office”), we defined the implicit sentiment held towards such situations automatically by using both a lexico-semantic knowledge base and a data-driven method. We further investigate how such implicit sentiment information affects irony detection by assessing a state-of-the-art irony classifier before and after it is informed with implicit sentiment information.

pdf bib
A Gold Standard for Multilingual Automatic Term Extraction from Comparable Corpora: Term Structure and Translation Equivalents
Ayla Rigouts Terryn | Véronique Hoste | Els Lefever
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
Proceedings of the First Workshop on Economics and Natural Language Processing
Udo Hahn | Véronique Hoste | Ming-Feng Tsai
Proceedings of the First Workshop on Economics and Natural Language Processing

pdf bib
Economic Event Detection in Company-Specific News Text
Gilles Jacobs | Els Lefever | Véronique Hoste
Proceedings of the First Workshop on Economics and Natural Language Processing

This paper presents a dataset and supervised classification approach for economic event detection in English news articles. Currently, the economic domain is lacking resources and methods for data-driven supervised event detection. The detection task is conceived as a sentence-level classification task for 10 different economic event types. Two different machine learning approaches were tested: a rich feature set Support Vector Machine (SVM) set-up and a word-vector-based long short-term memory recurrent neural network (RNN-LSTM) set-up. We show satisfactory results for most event types, with the linear kernel SVM outperforming the other experimental set-ups

pdf bib
Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis
Alexandra Balahur | Saif M. Mohammad | Veronique Hoste | Roman Klinger
Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

pdf bib
SemEval-2018 Task 3: Irony Detection in English Tweets
Cynthia Van Hee | Els Lefever | Véronique Hoste
Proceedings of The 12th International Workshop on Semantic Evaluation

This paper presents the first shared task on irony detection: given a tweet, automatic natural language processing systems should determine whether the tweet is ironic (Task A) and which type of irony (if any) is expressed (Task B). The ironic tweets were collected using irony-related hashtags (i.e. #irony, #sarcasm, #not) and were subsequently manually annotated to minimise the amount of noise in the corpus. Prior to distributing the data, hashtags that were used to collect the tweets were removed from the corpus. For both tasks, a training corpus of 3,834 tweets was provided, as well as a test set containing 784 tweets. Our shared tasks received submissions from 43 teams for the binary classification Task A and from 31 teams for the multiclass Task B. The highest classification scores obtained for both subtasks are respectively F1= 0.71 and F1= 0.51 and demonstrate that fine-grained irony classification is much more challenging than binary irony detection.

pdf bib
LT3 at SemEval-2018 Task 1: A classifier chain to detect emotions in tweets
Luna De Bruyne | Orphée De Clercq | Véronique Hoste
Proceedings of The 12th International Workshop on Semantic Evaluation

This paper presents an emotion classification system for English tweets, submitted for the SemEval shared task on Affect in Tweets, subtask 5: Detecting Emotions. The system combines lexicon, n-gram, style, syntactic and semantic features. For this multi-class multi-label problem, we created a classifier chain. This is an ensemble of eleven binary classifiers, one for each possible emotion category, where each model gets the predictions of the preceding models as additional features. The predicted labels are combined to get a multi-label representation of the predictions. Our system was ranked eleventh among thirty five participating teams, with a Jaccard accuracy of 52.0% and macro- and micro-average F1-scores of 49.3% and 64.0%, respectively.

2017

pdf bib
Towards an integrated pipeline for aspect-based sentiment analysis in various domains
Orphée De Clercq | Els Lefever | Gilles Jacobs | Tijl Carpels | Véronique Hoste
Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

This paper presents an integrated ABSA pipeline for Dutch that has been developed and tested on qualitative user feedback coming from three domains: retail, banking and human resources. The two latter domains provide service-oriented data, which has not been investigated before in ABSA. By performing in-domain and cross-domain experiments the validity of our approach was investigated. We show promising results for the three ABSA subtasks, aspect term extraction, aspect category classification and aspect polarity classification.

2016

pdf bib
A Classification-based Approach to Economic Event Detection in Dutch News Text
Els Lefever | Véronique Hoste
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Breaking news on economic events such as stock splits or mergers and acquisitions has been shown to have a substantial impact on the financial markets. As it is important to be able to automatically identify events in news items accurately and in a timely manner, we present in this paper proof-of-concept experiments for a supervised machine learning approach to economic event detection in newswire text. For this purpose, we created a corpus of Dutch financial news articles in which 10 types of company-specific economic events were annotated. We trained classifiers using various lexical, syntactic and semantic features. We obtain good results based on a basic set of shallow features, thus showing that this method is a viable approach for economic event detection in news text.

pdf bib
Exploring the Realization of Irony in Twitter Data
Cynthia Van Hee | Els Lefever | Véronique Hoste
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Handling figurative language like irony is currently a challenging task in natural language processing. Since irony is commonly used in user-generated content, its presence can significantly undermine accurate analysis of opinions and sentiment in such texts. Understanding irony is therefore important if we want to push the state-of-the-art in tasks such as sentiment analysis. In this research, we present the construction of a Twitter dataset for two languages, being English and Dutch, and the development of new guidelines for the annotation of verbal irony in social media texts. Furthermore, we present some statistics on the annotated corpora, from which we can conclude that the detection of contrasting evaluations might be a good indicator for recognizing irony.

pdf bib
Rude waiter but mouthwatering pastries! An exploratory study into Dutch Aspect-Based Sentiment Analysis
Orphée De Clercq | Véronique Hoste
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The fine-grained task of automatically detecting all sentiment expressions within a given document and the aspects to which they refer is known as aspect-based sentiment analysis. In this paper we present the first full aspect-based sentiment analysis pipeline for Dutch and apply it to customer reviews. To this purpose, we collected reviews from two different domains, i.e. restaurant and smartphone reviews. Both corpora have been manually annotated using newly developed guidelines that comply to standard practices in the field. For our experimental pipeline we perceive aspect-based sentiment analysis as a task consisting of three main subtasks which have to be tackled incrementally: aspect term extraction, aspect category classification and polarity classification. First experiments on our Dutch restaurant corpus reveal that this is indeed a feasible approach that yields promising results.

pdf bib
All Mixed Up? Finding the Optimal Feature Set for General Readability Prediction and Its Application to English and Dutch
Orphée De Clercq | Véronique Hoste
Computational Linguistics, Volume 42, Issue 3 - September 2016

pdf bib
Mental Distress Detection and Triage in Forum Posts: The LT3 CLPsych 2016 Shared Task System
Bart Desmet | Gilles Jacobs | Véronique Hoste
Proceedings of the Third Workshop on Computational Linguistics and Clinical Psychology

pdf bib
UGENT-LT3 SCATE Submission for WMT16 Shared Task on Quality Estimation
Arda Tezcan | Véronique Hoste | Lieve Macken
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

pdf bib
Detecting Grammatical Errors in Machine Translation Output Using Dependency Parsing and Treebank Querying
Arda Tezcan | Veronique Hoste | Lieve Macken
Proceedings of the 19th Annual Conference of the European Association for Machine Translation

pdf bib
Monday mornings are my fave :) #not Exploring the Automatic Recognition of Irony in English tweets
Cynthia Van Hee | Els Lefever | Véronique Hoste
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Recognising and understanding irony is crucial for the improvement natural language processing tasks including sentiment analysis. In this study, we describe the construction of an English Twitter corpus and its annotation for irony based on a newly developed fine-grained annotation scheme. We also explore the feasibility of automatic irony recognition by exploiting a varied set of features including lexical, syntactic, sentiment and semantic (Word2Vec) information. Experiments on a held-out test set show that our irony classifier benefits from this combined information, yielding an F1-score of 67.66%. When explicit hashtag information like #irony is included in the data, the system even obtains an F1-score of 92.77%. A qualitative analysis of the output reveals that recognising irony that results from a polarity clash appears to be (much) more feasible than recognising other forms of ironic utterances (e.g., descriptions of situational irony).

pdf bib
SemEval-2016 Task 5: Aspect Based Sentiment Analysis
Maria Pontiki | Dimitris Galanis | Haris Papageorgiou | Ion Androutsopoulos | Suresh Manandhar | Mohammad AL-Smadi | Mahmoud Al-Ayyoub | Yanyan Zhao | Bing Qin | Orphée De Clercq | Véronique Hoste | Marianna Apidianaki | Xavier Tannier | Natalia Loukachevitch | Evgeniy Kotelnikov | Nuria Bel | Salud María Jiménez-Zafra | Gülşen Eryiğit
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

pdf bib
UGENT-LT3 SCATE System for Machine Translation Quality Estimation
Arda Tezcan | Veronique Hoste | Bart Desmet | Lieve Macken
Proceedings of the Tenth Workshop on Statistical Machine Translation

pdf bib
Smart Computer Aided Translation Environment - SCATE
Vincent Vandeghinste | Tom Vanallemeersch | Frank Van Eynde | Geert Heyman | Sien Moens | Joris Pelemans | Patrick Wambacq | Iulianna Van der Lek - Ciudin | Arda Tezcan | Lieve Macken | Véronique Hoste | Eva Geurts | Mieke Haesen
Proceedings of the 18th Annual Conference of the European Association for Machine Translation

pdf bib
Smart Computer Aided Translation Environment
Vincent Vandeghinste | Tom Vanallemeersch | Frank Van Eynde | Geert Heyman | Sien Moens | Joris Pelemans | Patrick Wambacq | Iulianna Van der Lek - Ciudin | Arda Tezcan | Lieve Macken | Véronique Hoste | Eva Geurts | Mieke Haesen
Proceedings of the 18th Annual Conference of the European Association for Machine Translation

pdf bib
LT3: Sentiment Analysis of Figurative Tweets: piece of cake #NotReally
Cynthia Van Hee | Els Lefever | Véronique Hoste
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

pdf bib
LT3: Applying Hybrid Terminology Extraction to Aspect-Based Sentiment Analysis
Orphée De Clercq | Marjan Van de Kauter | Els Lefever | Véronique Hoste
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

pdf bib
Detection and Fine-Grained Classification of Cyberbullying Events
Cynthia Van Hee | Els Lefever | Ben Verhoeven | Julie Mennes | Bart Desmet | Guy De Pauw | Walter Daelemans | Veronique Hoste
Proceedings of the International Conference Recent Advances in Natural Language Processing

2014

pdf bib
SemEval 2014 Task 5 - L2 Writing Assistant
Maarten van Gompel | Iris Hendrickx | Antal van den Bosch | Els Lefever | Véronique Hoste
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

pdf bib
LT3: Sentiment Classification in User-Generated Content Using a Rich Feature Set
Cynthia Van Hee | Marjan Van de Kauter | Orphée De Clercq | Els Lefever | Véronique Hoste
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

pdf bib
Evaluation of Automatic Hypernym Extraction from Technical Corpora in English and Dutch
Els Lefever | Marjan Van de Kauter | Véronique Hoste
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this research, we evaluate different approaches for the automatic extraction of hypernym relations from English and Dutch technical text. The detected hypernym relations should enable us to semantically structure automatically obtained term lists from domain- and user-specific data. We investigated three different hypernymy extraction approaches for Dutch and English: a lexico-syntactic pattern-based approach, a distributional model and a morpho-syntactic method. To test the performance of the different approaches on domain-specific data, we collected and manually annotated English and Dutch data from two technical domains, viz. the dredging and financial domain. The experimental results show that especially the morpho-syntactic approach obtains good results for automatic hypernym extraction from technical and domain-specific texts.

pdf bib
Recognising suicidal messages in Dutch social media
Bart Desmet | Véronique Hoste
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Early detection of suicidal thoughts is an important part of effective suicide prevention. Such thoughts may be expressed online, especially by young people. This paper presents on-going work on the automatic recognition of suicidal messages in social media. We present experiments for automatically detecting relevant messages (with suicide-related content), and those containing suicide threats. A sample of 1357 texts was annotated in a corpus of 2674 blog posts and forum messages from Netlog, indicating relevance, origin, severity of suicide threat and risks as well as protective factors. For the classification experiments, Naive Bayes, SVM and KNN algorithms are combined with shallow features, i.e. bag-of-words of word, lemma and character ngrams, and post length. The best relevance classification is achieved by using SVM with post length, lemma and character ngrams, resulting in an F-score of 85.6% (78.7% precision and 93.8% recall). For the second task (threat detection), a cascaded setup which first filters out irrelevant messages with SVM and then predicts the severity with KNN, performs best: 59.2% F-score (69.5% precision and 51.6% recall).

pdf bib
Towards Shared Datasets for Normalization Research
Orphée De Clercq | Sarah Schulz | Bart Desmet | Véronique Hoste
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper we present a Dutch and English dataset that can serve as a gold standard for evaluating text normalization approaches. With the combination of text messages, message board posts and tweets, these datasets represent a variety of user generated content. All data was manually normalized to their standard form using newly-developed guidelines. We perform automatic lexical normalization experiments on these datasets using statistical machine translation techniques. We focus on both the word and character level and find that we can improve the BLEU score with ca. 20% for both languages. In order for this user generated content data to be released publicly to the research community some issues first need to be resolved. These are discussed in closer detail by focussing on the current legislation and by investigating previous similar data collection projects. With this discussion we hope to shed some light on various difficulties researchers are facing when trying to share social media data.

2013

pdf bib
Normalization of Dutch User-Generated Content
Orphée De Clercq | Sarah Schulz | Bart Desmet | Els Lefever | Véronique Hoste
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013

pdf bib
A Combined Pattern-based and Distributional Approach for Automatic Hypernym Detection in Dutch.
Gwendolijn Schropp | Els Lefever | Véronique Hoste
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013

pdf bib
SemEval-2013 Task 10: Cross-lingual Word Sense Disambiguation
Els Lefever | Véronique Hoste
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

2012

pdf bib
From Character to Word Level: Enabling the Linguistic Analyses of Inputlog Process Data
Mariëlle Leijten | Lieve Macken | Veronique Hoste | Eric Van Horenbeeck | Luuk Van Waes
Proceedings of the Second Workshop on Computational Linguistics and Writing (CL&W 2012): Linguistic and Cognitive Aspects of Document Creation and Document Engineering

pdf bib
From keystrokes to annotated process data: Enriching the output of Inputlog with linguistic information
Lieve Macken | Veronique Hoste | Mariëlle Leijten | Luuk Van Waes
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Keystroke logging tools are a valuable aid to monitor written language production. These tools record all keystrokes, including backspaces and deletions together with timing information. In this paper we report on an extension to the keystroke logging program Inputlog in which we aggregate the logged process data from the keystroke (character) level to the word level. The logged process data are further enriched with different kinds of linguistic information: part-of-speech tags, lemmata, chunk boundaries, syllable boundaries and word frequency. A dedicated parser has been developed that distils from the logged process data word-level revisions, deleted fragments and final product data. The linguistically-annotated output will facilitate the linguistic analysis of the logged data and will provide a valuable basis for more linguistically-oriented writing process research. The set-up of the extension to Inputlog is largely language-independent. As proof-of-concept, the extension has been developed for English and Dutch. Inputlog is freely available for research purposes.

pdf bib
Discovering Missing Wikipedia Inter-language Links by means of Cross-lingual Word Sense Disambiguation
Els Lefever | Véronique Hoste | Martine De Cock
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Wikipedia pages typically contain inter-language links to the corresponding pages in other languages. These links, however, are often incomplete. This paper describes a set of experiments in which the viability of discovering such missing inter-language links for ambiguous nouns by means of a cross-lingual Word Sense Disambiguation approach is investigated. The input for the inter-language link detection system is a set of Dutch pages for a given ambiguous noun and the output of the system is a set of links to the corresponding pages in three target languages (viz. French, Spanish and Italian). The experimental results show that although it is a very challenging task, the system succeeds to detect missing inter-language links between Wikipedia documents for a manually labeled test set. The final goal of the system is to provide a human editor with a list of possible missing links that should be manually verified.

pdf bib
Evaluating automatic cross-domain Dutch semantic role annotation
Orphée De Clercq | Veronique Hoste | Paola Monachesi
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

In this paper we present the first corpus where one million Dutch words from a variety of text genres have been annotated with semantic roles. 500K have been completely manually verified and used as training material to automatically label another 500K. All data has been annotated following an adapted version of the PropBank guidelines. The corpus's rich text type diversity and the availability of manually verified syntactic dependency structures allowed us to experiment with an existing semantic role labeler for Dutch. In order to test the system's portability across various domains, we experimented with training on individual domains and compared this with training on multiple domains by adding more data. Our results show that training on large data sets is necessary but that including genre-specific training material is also crucial to optimize classification. We observed that a small amount of in-domain training data is already sufficient to improve our semantic role labeler.

pdf bib
Beyond SoNaR: towards the facilitation of large corpus building efforts
Martin Reynaert | Ineke Schuurman | Véronique Hoste | Nelleke Oostdijk | Maarten van Gompel
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

In this paper we report on the experiences gained in the recent construction of the SoNaR corpus, a 500 MW reference corpus of contemporary, written Dutch. It shows what can realistically be done within the confines of a project setting where there are limitations to the duration in time as well to the budget, employing current state-of-the-art tools, standards and best practices. By doing so we aim to pass on insights that may be beneficial for anyone considering to undertake an effort towards building a large, varied yet balanced corpus for use by the wider research community. Various issues are discussed that come into play while compiling a large corpus, including approaches to acquiring texts, the arrangement of IPR, the choice of text formats, and steps to be taken in the preprocessing of data from widely different origins. We describe FoLiA, a new XML format geared at rich linguistic annotations. We also explain the rationale behind the investment in the high-quali ty semi-automatic enrichment of a relatively small (1 MW) subset with very rich syntactic and semantic annotations. Finally, we present some ideas about future developments and the direction corpus development may take, such as setting up an integrated work flow between web services and the potential role for ISOcat. We list tips for potential corpus builders, tricks they may want to try and further recommendations regarding technical developments future corpus builders may wish to hope for.

2011

pdf bib
An Evaluation and Possible Improvement Path for Current SMT Behavior on Ambiguous Nouns
Els Lefever | Véronique Hoste
Proceedings of Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation

pdf bib
Readability Annotation: Replacing the Expert by the Crowd
Philip van Oosten | Véronique Hoste
Proceedings of the Sixth Workshop on Innovative Use of NLP for Building Educational Applications

pdf bib
ParaSense or How to Use Parallel Corpora for Word Sense Disambiguation
Els Lefever | Véronique Hoste | Martine De Cock
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Cross-Domain Dutch Coreference Resolution
Orphée De Clercq | Véronique Hoste | Iris Hendrickx
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

2010

pdf bib
SemEval-2010 Task 1: Coreference Resolution in Multiple Languages
Marta Recasens | Lluís Màrquez | Emili Sapena | M. Antònia Martí | Mariona Taulé | Véronique Hoste | Massimo Poesio | Yannick Versley
Proceedings of the 5th International Workshop on Semantic Evaluation

pdf bib
SemEval-2010 Task 3: Cross-Lingual Word Sense Disambiguation
Els Lefever | Veronique Hoste
Proceedings of the 5th International Workshop on Semantic Evaluation

pdf bib
Construction of a Benchmark Data Set for Cross-lingual Word Sense Disambiguation
Els Lefever | Véronique Hoste
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Given the recent trend to evaluate the performance of word sense disambiguation systems in a more application-oriented set-up, we report on the construction of a multilingual benchmark data set for cross-lingual word sense disambiguation. The data set was created for a lexical sample of 25 English nouns, for which translations were retrieved in 5 languages, namely Dutch, German, French, Italian and Spanish. The corpus underlying the sense inventory was the parallel data set Europarl. The gold standard sense inventory was based on the automatic word alignments of the parallel corpus, which were manually verified. The resulting word alignments were used to perform a manual clustering of the translations over all languages in the parallel corpus. The inventory then served as input for the annotators of the sentences, who were asked to provide a maximum of three contextually relevant translations per language for a given focus word. The data set was released in the framework of the SemEval-2010 competition.

pdf bib
Interacting Semantic Layers of Annotation in SoNaR, a Reference Corpus of Contemporary Written Dutch
Ineke Schuurman | Véronique Hoste | Paola Monachesi
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper reports on the annotation of a corpus of 1 million words with four semantic annotation layers, including named entities, co- reference relations, semantic roles and spatial and temporal expressions. These semantic annotation layers can benefit from the manually verified part of speech tagging, lemmatization and syntactic analysis (dependency tree) information layers which resulted from an earlier project (Van Noord et al., 2006) and will thus result in a deeply syntactically and semantically annotated corpus. This annotation effort is carried out in the framework of a larger project which aims at the collection of a 500-million word corpus of contemporary Dutch, covering the variants used in the Netherlands and Flanders, the Dutch speaking part of Belgium. All the annotation schemes used were (co-)developed by the authors within the Flemish-Dutch STEVIN-programme as no previous schemes for Dutch were available. They were created taking into account standards (either de facto or official (like ISO)) used elsewhere.

pdf bib
Towards a Balanced Named Entity Corpus for Dutch
Bart Desmet | Véronique Hoste
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper introduces a new named entity corpus for Dutch. State-of-the-art named entity recognition systems require a substantial annotated corpus to be trained on. Such corpora exist for English, but not for Dutch. The STEVIN-funded SoNaR project aims to produce a diverse 500-million-word reference corpus of written Dutch, with four semantic annotation layers: named entities, coreference relations, semantic roles and spatiotemporal expressions. A 1-million-word subset will be manually corrected. Named entity annotation guidelines for Dutch were developed, adapted from the MUC and ACE guidelines. Adaptations include the annotation of products and events, the classification into subtypes, and the markup of metonymic usage. Inter-annotator agreement experiments were conducted to corroborate the reliability of the guidelines, which yielded satisfactory results (Kappa scores above 0.90). We are building a NER system, trained on the 1-million-word subcorpus, to automatically classify the remainder of the SoNaR corpus. To this end, experiments with various classification algorithms (MBL, SVM, CRF) and features have been carried out and evaluated.

pdf bib
Towards an Improved Methodology for Automated Readability Prediction
Philip van Oosten | Dries Tanghe | Véronique Hoste
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Since the first half of the 20th century, readability formulas have been widely employed to automatically predict the readability of an unseen text. In this article, the formulas and the text characteristics they are composed of are evaluated in the context of large Dutch and English corpora. We describe the behaviour of the formulas and the text characteristics by means of correlation matrices and a principal component analysis, and test the methodological validity of the formulas by means of collinearity tests. Both the correlation matrices and the principal component analysis show that the formulas described in this paper strongly correspond, regardless of the language for which they were designed. Furthermore, the collinearity test reveals shortcomings in the methodology that was used to create some of the existing readability formulas. All of this leads us to conclude that a new readability prediction method is needed. We finally make suggestions to come to a cleaner methodology and present web applications that will help us collect data to compile a new gold standard for readability prediction.

pdf bib
Towards a Learning Approach for Abbreviation Detection and Resolution.
Klaar Vanopstal | Bart Desmet | Véronique Hoste
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

The explosion of biomedical literature and with it the -uncontrolled- creation of abbreviations presents some special challenges for both human readers and computer applications. We developed an annotated corpus of Dutch medical text, and experimented with two approaches to abbreviation detection and resolution. Our corpus is composed of abstracts from two medical journals from the Low Countries in which approximately 65 percent (NTvG) and 48 percent (TvG) of the abbreviations have a corresponding full form in the abstract. Our first approach, a pattern-based system, consists of two steps: abbreviation detection and definition matching. This system has an average F-score of 0.82 for the detection of both defined and undefined abbreviations and an average F-score of 0.77 was obtained for the definitions. For our second approach, an SVM-based classifier was used on the preprocessed data sets, leading to an average F-score of 0.93 for the abbreviations; for the definitions an average F-score of 0.82 was obtained.

2009

pdf bib
Language-Independent Bilingual Terminology Extraction from a Multilingual Parallel Corpus
Els Lefever | Lieve Macken | Veronique Hoste
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

pdf bib
SemEval-2010 Task 3: Cross-lingual Word Sense Disambiguation
Els Lefever | Veronique Hoste
Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions (SEW-2009)

2008

pdf bib
Linguistically-Based Sub-Sentential Alignment for Terminology Extraction from a Bilingual Automotive Corpus
Lieve Macken | Els Lefever | Veronique Hoste
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

pdf bib
A Coreference Corpus and Resolution System for Dutch
Iris Hendrickx | Gosse Bouma | Frederik Coppens | Walter Daelemans | Veronique Hoste | Geert Kloosterman | Anne-Marie Mineur | Joeri Van Der Vloet | Jean-Luc Verschelde
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We present the main outcomes of the COREA project: a corpus annotated with coreferential relations and a coreference resolution system for Dutch. In the project we developed annotation guidelines for coreference resolution for Dutch and annotated a corpus of 135K tokens. We discuss these guidelines, the annotation tool, and the inter-annotator agreement. We also show a visualization of the annotated relations. The standard approach to evaluate a coreference resolution system is to compare the predictions of the system to a hand-annotated gold standard test set (cross-validation). A more practically oriented evaluation is to test the usefulness of coreference relation information in an NLP application. We run experiments with an Information Extraction module for the medical domain, and measure the performance of this module with and without the coreference relation information. We present the results of both this application-oriented evaluation of our system and of a standard cross-validation evaluation. In a separate experiment we also evaluate the effect of coreference information produced by a simple rule-based coreference module in a Question Answering application.

pdf bib
Learning-based Detection of Scientific Terms in Patient Information
Veronique Hoste | Els Lefever | Klaar Vanopstal | Isabelle Delaere
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In this paper, we investigate the use of a machine-learning based approach to the specific problem of scientific term detection in patient information. Lacking lexical databases which differentiate between the scientific and popular nature of medical terms, we used local context, morphosyntactic, morphological and statistical information to design a learner which accurately detects scientific medical terms. This study is the first step towards the automatic replacement of a scientific term by its popular counterpart, which should have a beneficial effect on readability. We show a F-score of 84% for the prediction of scientific terms in an English and Dutch EPAR corpus. Since recasting the term extraction problem as a classification problem leads to a large skewedness of the resulting data set, we rebalanced the data set through the application of some simple TF-IDF-based and Log-likelihood-based filters. We show that filtering indeed has a beneficial effect on the learner’s performance. However, the results of the filtering approach combined with the learning-based approach remain below those of the learning-based approach.

2007

pdf bib
AUG: A combined classification and clustering approach for web people disambiguation
Els Lefever | Véronique Hoste | Timur Fayruzov
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

2006

pdf bib
KNACK-2002: a Richly Annotated Corpus of Dutch Written Text
Véronique Hoste | Guy De Pauw
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In this paper, we introduce the annotated KNACK-2002 corpus of Dutch written text. The corpus features five different annotation layers, ranging from the annotation of morphological boundaries at the word level, over the annotation of part-of-speech tags and phrase chunks at the syntactic level to the annotation of named entities at the semantic level and coreferential relations at the discourse level. We believe the corpus is unique in the Dutch language area because of its richness of annotation layers, providing researchers with a useful gold standard data set for different NLP tasks in the domains of morphology, (morpho)syntax, semantics and discourse.

2004

pdf bib
GAMBL, genetic algorithm optimization of memory-based WSD
Bart Decadt | Véronique Hoste | Walter Daelemans | Antal van den Bosch
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text

2003

pdf bib
Learning to Predict Pitch Accents and Prosodic Boundaries in Dutch
Erwin Marsi | Martin Reynaert | Antal van den Bosch | Walter Daelemans | Véronique Hoste
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

2002

pdf bib
Evaluation of Machine Learning Methods for Natural Language Processing Tasks
Walter Daelemans | Véronique Hoste
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf bib
Dutch Word Sense Disambiguation: Optimizing the Localness of Context
Antal van den Bosch | Iris Hendrickx | Veronique Hoste | Walter Daelemans
Proceedings of the ACL-02 Workshop on Word Sense Disambiguation: Recent Successes and Future Directions

pdf bib
Evaluating the results of a memory-based word-expert approach to unrestricted word sense disambiguation
Veronique Hoste | Walter Daelemans | Iris Hendrickx | Antal van den Bosch
Proceedings of the ACL-02 Workshop on Word Sense Disambiguation: Recent Successes and Future Directions

2001

pdf bib
Classifier Optimization and Combination in the English All Words Task
Véronique Hoste | Anne Kool | Walter Daelemans
Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems

2000

pdf bib
A Rule Induction Approach to Modeling Regional Pronunciation Variation
Veronique Hoste | Steven Gillis | Walter Daelemans
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

Search
Co-authors