Hristo Tanev

Also published as: Hristo Tannev


2021

pdf bib
Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021): Workshop and Shared Task Report
Ali Hürriyetoğlu | Hristo Tanev | Vanni Zavarella | Jakub Piskorski | Reyyan Yeniterzi | Osman Mutlu | Deniz Yuret | Aline Villavicencio
Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021)

This workshop is the fourth issue of a series of workshops on automatic extraction of socio-political events from news, organized by the Emerging Market Welfare Project, with the support of the Joint Research Centre of the European Commission and with contributions from many other prominent scholars in this field. The purpose of this series of workshops is to foster research and development of reliable, valid, robust, and practical solutions for automatically detecting descriptions of socio-political events, such as protests, riots, wars and armed conflicts, in text streams. This year workshop contributors make use of the state-of-the-art NLP technologies, such as Deep Learning, Word Embeddings and Transformers and cover a wide range of topics from text classification to news bias detection. Around 40 teams have registered and 15 teams contributed to three tasks that are i) multilingual protest news detection detection, ii) fine-grained classification of socio-political events, and iii) discovering Black Lives Matter protest events. The workshop also highlights two keynote and four invited talks about various aspects of creating event data sets and multi- and cross-lingual machine learning in few- and zero-shot settings.

pdf bib
Discovering Black Lives Matter Events in the United States: Shared Task 3, CASE 2021
Salvatore Giorgi | Vanni Zavarella | Hristo Tanev | Nicolas Stefanovitch | Sy Hwang | Hansi Hettiarachchi | Tharindu Ranasinghe | Vivek Kalyan | Paul Tan | Shaun Tan | Martin Andrews | Tiancheng Hu | Niklas Stoehr | Francesco Ignazio Re | Daniel Vegh | Dennis Atzenhofer | Brenda Curtis | Ali Hürriyetoğlu
Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021)

Evaluating the state-of-the-art event detection systems on determining spatio-temporal distribution of the events on the ground is performed unfrequently. But, the ability to both (1) extract events “in the wild” from text and (2) properly evaluate event detection systems has potential to support a wide variety of tasks such as monitoring the activity of socio-political movements, examining media coverage and public support of these movements, and informing policy decisions. Therefore, we study performance of the best event detection systems on detecting Black Lives Matter (BLM) events from tweets and news articles. The murder of George Floyd, an unarmed Black man, at the hands of police officers received global attention throughout the second half of 2020. Protests against police violence emerged worldwide and the BLM movement, which was once mostly regulated to the United States, was now seeing activity globally. This shared task asks participants to identify BLM related events from large unstructured data sources, using systems pretrained to extract socio-political events from text. We evaluate several metrics, accessing each system’s ability to identify protest events both temporally and spatially. Results show that identifying daily protest counts is an easier task than classifying spatial and temporal protest trends simultaneously, with maximum performance of 0.745 and 0.210 (Pearson r), respectively. Additionally, all baselines and participant systems suffered from low recall, with a maximum recall of 5.08.

2020

pdf bib
Proceedings of the Workshop on Automated Extraction of Socio-political Events from News 2020
Ali Hürriyetoğlu | Erdem Yörük | Vanni Zavarella | Hristo Tanev
Proceedings of the Workshop on Automated Extraction of Socio-political Events from News 2020

pdf bib
Automated Extraction of Socio-political Events from News (AESPEN): Workshop and Shared Task Report
Ali Hürriyetoğlu | Vanni Zavarella | Hristo Tanev | Erdem Yörük | Ali Safaya | Osman Mutlu
Proceedings of the Workshop on Automated Extraction of Socio-political Events from News 2020

We describe our effort on automated extraction of socio-political events from news in the scope of a workshop and a shared task we organized at Language Resources and Evaluation Conference (LREC 2020). We believe the event extraction studies in computational linguistics and social and political sciences should further support each other in order to enable large scale socio-political event information collection across sources, countries, and languages. The event consists of regular research papers and a shared task, which is about event sentence coreference identification (ESCI), tracks. All submissions were reviewed by five members of the program committee. The workshop attracted research papers related to evaluation of machine learning methodologies, language resources, material conflict forecasting, and a shared task participation report in the scope of socio-political event information collection. It has shown us the volume and variety of both the data sources and event information collection approaches related to socio-political events and the need to fill the gap between automated text processing techniques and requirements of social and political sciences.

2019

pdf bib
JRC TMA-CC: Slavic Named Entity Recognition and Linking. Participation in the BSNLP-2019 shared task
Guillaume Jacquet | Jakub Piskorski | Hristo Tanev | Ralf Steinberger
Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing

We report on the participation of the JRC Text Mining and Analysis Competence Centre (TMA-CC) in the BSNLP-2019 Shared Task, which focuses on named-entity recognition, lemmatisation and cross-lingual linking. We propose a hybrid system combining a rule-based approach and light ML techniques. We use multilingual lexical resources such as JRC-NAMES and BABELNET together with a named entity guesser to recognise names. In a second step, we combine known names with wild cards to increase recognition recall by also capturing inflection variants. In a third step, we increase precision by filtering these name candidates with automatically learnt inflection patterns derived from name occurrences in large news article collections. Our major requirement is to achieve high precision. We achieved an average of 65% F-measure with 93% precision on the four languages.

2017

pdf bib
Large-scale news entity sentiment analysis
Ralf Steinberger | Stefanie Hegele | Hristo Tanev | Leonida Della Rocca
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

We work on detecting positive or negative sentiment towards named entities in very large volumes of news articles. The aim is to monitor changes over time, as well as to work towards media bias detection by com-paring differences across news sources and countries. With view to applying the same method to dozens of languages, we use lin-guistically light-weight methods: searching for positive and negative terms in bags of words around entity mentions (also consid-ering negation). Evaluation results are good and better than a third-party baseline sys-tem, but precision is not sufficiently high to display the results publicly in our multilin-gual news analysis system Europe Media Monitor (EMM). In this paper, we focus on describing our effort to improve the English language results by avoiding the biggest sources of errors. We also present new work on using a syntactic parser to identify safe opinion recognition rules, such as predica-tive structures in which sentiment words di-rectly refer to an entity. The precision of this method is good, but recall is very low.

pdf bib
On the Creation of a Security-Related Event Corpus
Martin Atkinson | Jakub Piskorski | Hristo Tanev | Vanni Zavarella
Proceedings of the Events and Stories in the News Workshop

This paper reports on an effort of creating a corpus of structured information on security-related events automatically extracted from on-line news, part of which has been manually curated. The main motivation behind this effort is to provide material to the NLP community working on event extraction that could be used both for training and evaluation purposes.

2016

pdf bib
Detecting Implicit Expressions of Affect from Text using Semantic Knowledge on Common Concept Properties
Alexandra Balahur | Hristo Tanev
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Emotions are an important part of the human experience. They are responsible for the adaptation and integration in the environment, offering, most of the time together with the cognitive system, the appropriate responses to stimuli in the environment. As such, they are an important component in decision-making processes. In today’s society, the avalanche of stimuli present in the environment (physical or virtual) makes people more prone to respond to stronger affective stimuli (i.e., those that are related to their basic needs and motivations ― survival, food, shelter, etc.). In media reporting, this is translated in the use of arguments (factual data) that are known to trigger specific (strong, affective) behavioural reactions from the readers. This paper describes initial efforts to detect such arguments from text, based on the properties of concepts. The final system able to retrieve and label this type of data from the news in traditional and social platforms is intended to be integrated Europe Media Monitor family of applications to detect texts that trigger certain (especially negative) reactions from the public, with consequences on citizen safety and security.

pdf bib
Deftor at SemEval-2016 Task 14: Taxonomy enrichment using definition vectors
Hristo Tanev | Agata Rotondi
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

pdf bib
The 5th Workshop on Balto-Slavic Natural Language Processing
Jakub Piskorski | Lidia Pivovarova | Jan Šnajder | Hristo Tanev | Roman Yangarber
The 5th Workshop on Balto-Slavic Natural Language Processing

pdf bib
Towards Multilingual Event Extraction Evaluation: A Case Study for the Czech Language
Josef Steinberger | Hristo Tanev
Proceedings of the International Conference Recent Advances in Natural Language Processing

2014

pdf bib
Challenges in Creating a Multilingual Sentiment Analysis Application for Social Media Mining
Alexandra Balahur | Hristo Tanev | Erik van der Goot
Proceedings of the 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

pdf bib
Event Extraction for Balkan Languages
Vanni Zavarella | Dilek Küçük | Hristo Tanev | Ali Hürriyetoğlu
Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics

2013

pdf bib
Proceedings of the 4th Biennial International Workshop on Balto-Slavic Natural Language Processing
Jakub Piskorski | Lidia Pivovarova | Hristo Tanev | Roman Yangarber
Proceedings of the 4th Biennial International Workshop on Balto-Slavic Natural Language Processing

pdf bib
Semi-automatic Acquisition of Lexical Resources and Grammars for Event Extraction in Bulgarian and Czech
Hristo Tanev | Josef Steinberger
Proceedings of the 4th Biennial International Workshop on Balto-Slavic Natural Language Processing

pdf bib
Acronym recognition and processing in 22 languages
Maud Ehrmann | Leonida Della Rocca | Ralf Steinberger | Hristo Tannev
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013

pdf bib
Detecting Event-Related Links and Sentiments from Social Media Texts
Alexandra Balahur | Hristo Tanev
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations

pdf bib
FSS-TimEx for TempEval-3: Extracting Temporal Information from Text
Vanni Zavarella | Hristo Tanev
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

2011

pdf bib
Creating Sentiment Dictionaries via Triangulation
Josef Steinberger | Polina Lenkova | Mohamed Ebrahim | Maud Ehrmann | Ali Hurriyetoglu | Mijail Kabadjov | Ralf Steinberger | Hristo Tanev | Vanni Zavarella | Silvia Vázquez
Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA 2.011)

pdf bib
Pattern Learning for Event Extraction using Monolingual Statistical Machine Translation
Marco Turchi | Vanni Zavarella | Hristo Tanev
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

2008

pdf bib
Online-Monitoring of Security-Related Events
Martin Atkinson | Jakub Piskorski | Bruno Pouliquen | Ralf Steinberger | Hristo Tanev | Vanni Zavarella
Coling 2008: Companion volume: Demonstrations

2007

pdf bib
Proceedings of the Workshop on Balto-Slavonic Natural Language Processing
Jakub Piskorski | Hristo Tanev
Proceedings of the Workshop on Balto-Slavonic Natural Language Processing

2006

pdf bib
Weakly Supervised Approaches for Ontology Population
Hristo Tanev | Bernardo Magnini
11th Conference of the European Chapter of the Association for Computational Linguistics

2004

pdf bib
Multilingual Pattern Libraries for Question Answering: a Case Study for Definition Questions
Hristo Tanev | Milen Kouylekov | Matteo Negri | Bonaventura Coppola | Bernardo Magnini
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

pdf bib
Scaling Web-based Acquisition of Entailment Relations
Idan Szpektor | Hristo Tanev | Ido Dagan | Bonaventura Coppola
Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing

2002

pdf bib
A WordNet-Based Approach to Named Entites Recognition
Bernardo Magnini | Matteo Negri | Roberto Prevete | Hristo Tanev
COLING-02: SEMANET: Building and Using Semantic Networks

pdf bib
Is It the Right Answer? Exploiting Web Redundancy for Answer Validation
Bernardo Magnini | Matteo Negri | Roberto Prevete | Hristo Tanev
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics

pdf bib
Towards Automatic Evaluation of Question/Answering Systems
Bernardo Magnini | Matteo Negri | Roberto Prevete | Hristo Tanev
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf bib
Shallow Language Processing Architecture for Bulgarian
Hristo Tanev | Ruslan Mitkov
COLING 2002: The 19th International Conference on Computational Linguistics

2000

pdf bib
LINGUA: a robust architecture for text processing and anaphora resolution in Bulgarian
Hristo Tanev | Ruslan Mitkov
Proceedings of the International Conference on Machine Translation and Multilingual Applications in the new Millennium: MT 2000