Vanni Zavarella


2021

pdf bib
Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021): Workshop and Shared Task Report
Ali Hürriyetoğlu | Hristo Tanev | Vanni Zavarella | Jakub Piskorski | Reyyan Yeniterzi | Osman Mutlu | Deniz Yuret | Aline Villavicencio
Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021)

This workshop is the fourth issue of a series of workshops on automatic extraction of socio-political events from news, organized by the Emerging Market Welfare Project, with the support of the Joint Research Centre of the European Commission and with contributions from many other prominent scholars in this field. The purpose of this series of workshops is to foster research and development of reliable, valid, robust, and practical solutions for automatically detecting descriptions of socio-political events, such as protests, riots, wars and armed conflicts, in text streams. This year workshop contributors make use of the state-of-the-art NLP technologies, such as Deep Learning, Word Embeddings and Transformers and cover a wide range of topics from text classification to news bias detection. Around 40 teams have registered and 15 teams contributed to three tasks that are i) multilingual protest news detection detection, ii) fine-grained classification of socio-political events, and iii) discovering Black Lives Matter protest events. The workshop also highlights two keynote and four invited talks about various aspects of creating event data sets and multi- and cross-lingual machine learning in few- and zero-shot settings.

pdf bib
Discovering Black Lives Matter Events in the United States: Shared Task 3, CASE 2021
Salvatore Giorgi | Vanni Zavarella | Hristo Tanev | Nicolas Stefanovitch | Sy Hwang | Hansi Hettiarachchi | Tharindu Ranasinghe | Vivek Kalyan | Paul Tan | Shaun Tan | Martin Andrews | Tiancheng Hu | Niklas Stoehr | Francesco Ignazio Re | Daniel Vegh | Dennis Atzenhofer | Brenda Curtis | Ali Hürriyetoğlu
Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021)

Evaluating the state-of-the-art event detection systems on determining spatio-temporal distribution of the events on the ground is performed unfrequently. But, the ability to both (1) extract events “in the wild” from text and (2) properly evaluate event detection systems has potential to support a wide variety of tasks such as monitoring the activity of socio-political movements, examining media coverage and public support of these movements, and informing policy decisions. Therefore, we study performance of the best event detection systems on detecting Black Lives Matter (BLM) events from tweets and news articles. The murder of George Floyd, an unarmed Black man, at the hands of police officers received global attention throughout the second half of 2020. Protests against police violence emerged worldwide and the BLM movement, which was once mostly regulated to the United States, was now seeing activity globally. This shared task asks participants to identify BLM related events from large unstructured data sources, using systems pretrained to extract socio-political events from text. We evaluate several metrics, accessing each system’s ability to identify protest events both temporally and spatially. Results show that identifying daily protest counts is an easier task than classifying spatial and temporal protest trends simultaneously, with maximum performance of 0.745 and 0.210 (Pearson r), respectively. Additionally, all baselines and participant systems suffered from low recall, with a maximum recall of 5.08.

2020

pdf bib
Proceedings of the Workshop on Automated Extraction of Socio-political Events from News 2020
Ali Hürriyetoğlu | Erdem Yörük | Vanni Zavarella | Hristo Tanev
Proceedings of the Workshop on Automated Extraction of Socio-political Events from News 2020

pdf bib
Automated Extraction of Socio-political Events from News (AESPEN): Workshop and Shared Task Report
Ali Hürriyetoğlu | Vanni Zavarella | Hristo Tanev | Erdem Yörük | Ali Safaya | Osman Mutlu
Proceedings of the Workshop on Automated Extraction of Socio-political Events from News 2020

We describe our effort on automated extraction of socio-political events from news in the scope of a workshop and a shared task we organized at Language Resources and Evaluation Conference (LREC 2020). We believe the event extraction studies in computational linguistics and social and political sciences should further support each other in order to enable large scale socio-political event information collection across sources, countries, and languages. The event consists of regular research papers and a shared task, which is about event sentence coreference identification (ESCI), tracks. All submissions were reviewed by five members of the program committee. The workshop attracted research papers related to evaluation of machine learning methodologies, language resources, material conflict forecasting, and a shared task participation report in the scope of socio-political event information collection. It has shown us the volume and variety of both the data sources and event information collection approaches related to socio-political events and the need to fill the gap between automated text processing techniques and requirements of social and political sciences.

2018

pdf bib
On Training Classifiers for Linking Event Templates
Jakub Piskorski | Fredi Šarić | Vanni Zavarella | Martin Atkinson
Proceedings of the Workshop Events and Stories in the News 2018

The paper reports on exploring various machine learning techniques and a range of textual and meta-data features to train classifiers for linking related event templates automatically extracted from online news. With the best model using textual features only we achieved 94.7% (92.9%) F1 score on GOLD (SILVER) dataset. These figures were further improved to 98.6% (GOLD) and 97% (SILVER) F1 score by adding meta-data features, mainly thanks to the strong discriminatory power of automatically extracted geographical information related to events.

2017

pdf bib
On the Creation of a Security-Related Event Corpus
Martin Atkinson | Jakub Piskorski | Hristo Tanev | Vanni Zavarella
Proceedings of the Events and Stories in the News Workshop

This paper reports on an effort of creating a corpus of structured information on security-related events automatically extracted from on-line news, part of which has been manually curated. The main motivation behind this effort is to provide material to the NLP community working on event extraction that could be used both for training and evaluation purposes.

2014

pdf bib
Resource Creation and Evaluation for Multilingual Sentiment Analysis in Social Media Texts
Alexandra Balahur | Marco Turchi | Ralf Steinberger | Jose-Manuel Perea-Ortega | Guillaume Jacquet | Dilek Küçük | Vanni Zavarella | Adil El Ghali
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper presents an evaluation of the use of machine translation to obtain and employ data for training multilingual sentiment classifiers. We show that the use of machine translated data obtained similar results as the use of native-speaker translations of the same data. Additionally, our evaluations pinpoint to the fact that the use of multilingual data, including that obtained through machine translation, leads to improved results in sentiment classification. Finally, we show that the performance of the sentiment classifiers built on machine translated data can be improved using original data from the target language and that even a small amount of such texts can lead to significant growth in the classification performance.

pdf bib
Event Extraction for Balkan Languages
Vanni Zavarella | Dilek Küçük | Hristo Tanev | Ali Hürriyetoğlu
Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics

2013

pdf bib
FSS-TimEx for TempEval-3: Extracting Temporal Information from Text
Vanni Zavarella | Hristo Tanev
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

2011

pdf bib
Creating Sentiment Dictionaries via Triangulation
Josef Steinberger | Polina Lenkova | Mohamed Ebrahim | Maud Ehrmann | Ali Hurriyetoglu | Mijail Kabadjov | Ralf Steinberger | Hristo Tanev | Vanni Zavarella | Silvia Vázquez
Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA 2.011)

pdf bib
Pattern Learning for Event Extraction using Monolingual Statistical Machine Translation
Marco Turchi | Vanni Zavarella | Hristo Tanev
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

2010

pdf bib
Sentiment Analysis in the News
Alexandra Balahur | Ralf Steinberger | Mijail Kabadjov | Vanni Zavarella | Erik van der Goot | Matina Halkia | Bruno Pouliquen | Jenya Belyaeva
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Recent years have brought a significant growth in the volume of research in sentiment analysis, mostly on highly subjective text types (movie or product reviews). The main difference these texts have with news articles is that their target is clearly defined and unique across the text. Following different annotation efforts and the analysis of the issues encountered, we realised that news opinion mining is different from that of other text types. We identified three subtasks that need to be addressed: definition of the target; separation of the good and bad news content from the good and bad sentiment expressed on the target; and analysis of clearly marked opinion that is expressed explicitly, not needing interpretation or the use of world knowledge. Furthermore, we distinguish three different possible views on newspaper articles ― author, reader and text, which have to be addressed differently at the time of analysing sentiment. Given these definitions, we present work on mining opinions about entities in English language news, in which we apply these concepts. Results showed that this idea is more appropriate in the context of news opinion mining and that the approaches taking this into consideration produce a better performance.

2008

pdf bib
Online-Monitoring of Security-Related Events
Martin Atkinson | Jakub Piskorski | Bruno Pouliquen | Ralf Steinberger | Hristo Tanev | Vanni Zavarella
Coling 2008: Companion volume: Demonstrations