Erik Tjong Kim Sang

Also published as: Erik F. Tjong Kim Sang

2020

pdf bib abs
Public Sentiment on Governmental COVID-19 Measures in Dutch Social Media
Shihan Wang | Marijn Schraagen | Erik Tjong Kim Sang | Mehdi Dastani
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020

Public sentiment (the opinion, attitude or feeling that the public expresses) is a factor of interest for government, as it directly influences the implementation of policies. Given the unprecedented nature of the COVID-19 crisis, having an up-to-date representation of public sentiment on governmental measures and announcements is crucial. In this paper, we analyse Dutch public sentiment on governmental COVID-19 measures from text data collected across three online media sources (Twitter, Reddit and Nu.nl) from February to September 2020. We apply sentiment analysis methods to analyse polarity over time, as well as to identify stance towards two specific pandemic policies regarding social distancing and wearing face masks. The presented preliminary results provide valuable insights into the narratives shown in vast social media text data, which help understand the influence of COVID-19 measures on the general public.

2016

pdf bib abs
Nederlab: Towards a Single Portal and Research Environment for Diachronic Dutch Text Corpora
Hennie Brugman | Martin Reynaert | Nicoline van der Sijs | René van Stipriaan | Erik Tjong Kim Sang | Antal van den Bosch
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The Nederlab project aims to bring together all digitized texts relevant to the Dutch national heritage, the history of the Dutch language and culture (circa 800 – present) in one user friendly and tool enriched open access web interface. This paper describes Nederlab halfway through the project period and discusses the collections incorporated, back-office processes, system back-end as well as the Nederlab Research Portal end-user web application.

pdf bib abs
Finding Rising and Falling Words
Erik Tjong Kim Sang
Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH)

We examine two different methods for finding rising words (among which neologisms) and falling words (among which archaisms) in decades of magazine texts (millions of words) and in years of tweets (billions of words): one based on correlation coefficients of relative frequencies and time, and one based on comparing initial and final word frequencies of time intervals. We find that smoothing frequency scores improves the precision scores of both methods and that the correlation coefficients perform better on magazine text but worse on tweets. Since the two ranking methods find different words they can be used in side-by-side to study the behavior of words over time.

2012

pdf bib
Predicting the 2011 Dutch Senate Election Results with Twitter
Erik Tjong Kim Sang | Johan Bos
Proceedings of the Workshop on Semantic Analysis in Social Media

2010

In this paper we describe GikiCLEF, the first evaluation contest that, to our knowledge, was specifically designed to expose and investigate cultural and linguistic issues involved in structured multimedia collections and searching, and which was organized under the scope of CLEF 2009. GikiCLEF evaluated systems that answered hard questions for both human and machine, in ten different Wikipedia collections, namely Bulgarian, Dutch, English, German, Italian, Norwegian (Bokmäl and Nynorsk), Portuguese, Romanian, and Spanish. After a short historical introduction, we present the task, together with its motivation, and discuss how the topics were chosen. Then we provide another description from the point of view of the participants. Before disclosing their results, we introduce the SIGA management system explaining the several tasks which were carried out behind the scenes. We quantify in turn the GIRA resource, offered to the community for training and further evaluating systems with the help of the 50 topics gathered and the solutions identified. We end the paper with a critical discussion of what was learned, advancing possible ways to reuse the data.

pdf bib
A Baseline Approach for Detecting Sentences Containing Uncertainty
Erik Tjong Kim Sang
Proceedings of the Fourteenth Conference on Computational Natural Language Learning – Shared Task