Nicolas Stefanovitch


FRAPPE: FRAming, Persuasion, and Propaganda Explorer
Ahmed Sajwani | Alaa El Setohy | Ali Mekky | Diana Turmakhan | Lara Hassan | Mohamed El Zeftawy | Omar El Herraoui | Osama Afzal | Qisheng Liao | Tarek Mahmoud | Zain Muhammad Mujahid | Muhammad Umar Salman | Muhammad Arslan Manzoor | Massa Baali | Jakub Piskorski | Nicolas Stefanovitch | Giovanni Da San Martino | Preslav Nakov
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

The abundance of news sources and the urgent demand for reliable information have led to serious concerns about the threat of misleading information. In this paper, we present FRAPPE, a FRAming, Persuasion, and Propaganda Explorer system. FRAPPE goes beyond conventional news analysis of articles and unveils the intricate linguistic techniques used to shape readers’ opinions and emotions. Our system allows users not only to analyze individual articles for their genre, framings, and use of persuasion techniques, but also to draw comparisons between the strategies of persuasion and framing adopted by a diverse pool of news outlets and countries across multiple languages for different topics, thus providing a comprehensive understanding of how information is presented and manipulated. FRAPPE is publicly accessible at and a video explaining our system is available at


Holistic Inter-Annotator Agreement and Corpus Coherence Estimation in a Large-scale Multilingual Annotation Campaign
Nicolas Stefanovitch | Jakub Piskorski
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

In this paper we report on the complexity of persuasion technique annotation in the context of a large multilingual annotation campaign involving 6 languages and approximately 40 annotators. We highlight the techniques that appear to be difficult for humans to annotate and elaborate on our findings on the causes of this phenomenon. We introduce Holistic IAA, a new word embedding-based annotator agreement metric, and we report on various experiments using this metric and its correlation with the traditional Inter Annotator Agreement (IAA) metrics. However, given somewhat limited and loose interaction between annotators, i.e., only a few annotators annotate the same document subsets, we try to devise a way to assess the coherence of the entire dataset and strive to find a good proxy for IAA between annotators tasked to annotate different documents and in different languages, for which classical IAA metrics cannot be applied.

Detecting and Geocoding Battle Events from Social Media Messages on the Russo-Ukrainian War: Shared Task 2, CASE 2023
Hristo Tanev | Nicolas Stefanovitch | Andrew Halterman | Onur Uca | Vanni Zavarella | Ali Hurriyetoglu | Bertrand De Longueville | Leonida Della Rocca
Proceedings of the 6th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text

The purpose of Shared Task 2 at the Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE) 2023 workshop was to test the abilities of the participating models and systems to detect and geocode armed conflict events in social media messages from Telegram channels reporting on the Russo-Ukrainian war. The evaluation followed an approach which was introduced in CASE 2021 (Giorgi et al., 2021): for each system we consider the correlation of the spatio-temporal distribution of its detected events and the events identified for the same period in the ACLED (Armed Conflict Location and Event Data Project) database (Raleigh et al., 2010). We use ACLED for the ground truth, since it is a well-established standard in the field of event extraction and political trend analysis, which relies on human annotators for the encoding of security events using a fine-grained taxonomy. Two systems participated in this shared task; we report in this paper on both the shared task and the participating systems.

Multilingual Multifaceted Understanding of Online News in Terms of Genre, Framing, and Persuasion Techniques
Jakub Piskorski | Nicolas Stefanovitch | Nikolaos Nikolaidis | Giovanni Da San Martino | Preslav Nakov
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We present a new multilingual multifacet dataset of news articles, each annotated for genre (objective news reporting vs. opinion vs. satire), framing (what key aspects are highlighted), and persuasion techniques (logical fallacies, emotional appeals, ad hominem attacks, etc.). The persuasion techniques are annotated at the span level, using a taxonomy of 23 fine-grained techniques grouped into 6 coarse categories. The dataset contains 1,612 news articles covering recent news on current topics of public interest in six European languages (English, French, German, Italian, Polish, and Russian), with more than 37k annotated spans of persuasion techniques. We describe the dataset and the annotation process, and we report the evaluation results of multilabel classification experiments using state-of-the-art multilingual transformers at different levels of granularity: token-level, sentence-level, paragraph-level, and document-level.

TeamEC at SemEval-2023 Task 4: Transformers vs. Low-Resource Dictionaries, Expert Dictionary vs. Learned Dictionary
Nicolas Stefanovitch | Bertrand De Longueville | Mario Scharfbillig
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper describes the system we used to participate in the shared task, as well as additional experiments beyond the scope of the shared task, but using its data. Our primary goal is to compare the effectiveness of transformer models with that of low-resource dictionaries. Secondly, we compare the performance of a learned dictionary with that of a dictionary designed by experts in the field of values. Our findings surprisingly show that transformers perform on par with a dictionary containing less than 1k words, when evaluated with 19 fine-grained categories, and only outperform a dictionary-based approach in a coarse setting with 10 categories. Interestingly, the expert dictionary has a precision on par with the learned one, while its recall is clearly lower, potentially an indication of overfitting of topics to values in the shared task’s dataset. Our findings should be of interest to both the NLP and Value scientific communities on the use of automated approaches for value classification.

SemEval-2023 Task 3: Detecting the Category, the Framing, and the Persuasion Techniques in Online News in a Multi-lingual Setup
Jakub Piskorski | Nicolas Stefanovitch | Giovanni Da San Martino | Preslav Nakov
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

We describe SemEval-2023 Task 3 on Detecting the Category, the Framing, and the Persuasion Techniques in Online News in a Multilingual Setup: the dataset, the task organization process, the evaluation setup, the results, and the participating systems. The task focused on news articles in nine languages (six known to the participants upfront: English, French, German, Italian, Polish, and Russian, and three additional ones revealed to the participants at the testing phase: Spanish, Greek, and Georgian). The task featured three subtasks: (1) determining the genre of the article (opinion, reporting, or satire), (2) identifying one or more frames used in an article from a pool of 14 generic frames, and (3) identifying the persuasion techniques used in each paragraph of the article, using a taxonomy of 23 persuasion techniques. This was a very popular task: a total of 181 teams registered to participate, and 41 eventually made an official submission on the test set.

On Experiments of Detecting Persuasion Techniques in Polish and Russian Online News: Preliminary Study
Nikolaos Nikolaidis | Nicolas Stefanovitch | Jakub Piskorski
Proceedings of the 9th Workshop on Slavic Natural Language Processing 2023 (SlavicNLP 2023)

This paper reports on the results of preliminary experiments on the detection of persuasion techniques in online news in Polish and Russian, using a taxonomy of 23 persuasion techniques. The evaluation addresses different aspects, namely, the granularity of the persuasion technique category, i.e., coarse- (6 labels) versus fine-grained (23 labels), and the focus of the classification, i.e., at which level the labels are detected (subword, sentence, or paragraph). We compare the performance of mono- versus multi-lingual-trained state-of-the-art transformer-based models in this context.


Resources and Experiments on Sentiment Classification for Georgian
Nicolas Stefanovitch | Jakub Piskorski | Sopho Kharazi
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This paper presents, to the best of our knowledge, the first ever publicly available annotated dataset for sentiment classification and semantic polarity dictionary for Georgian. The characteristics of these resources and the process of their creation are described in detail. The results of various experiments on the performance of both lexicon- and machine learning-based models for Georgian sentiment classification are also reported. Both 3-label (positive, neutral, negative) and 4-label settings (same labels + mixed) are considered. The machine learning models explored include, i.a., logistic regression, SVMs, and transformer-based models. We also explore transfer learning- and translation-based (to a well-supported language) approaches. The obtained results for Georgian are on par with the state-of-the-art results in sentiment classification for well-studied languages when using training data of comparable size.

Team TMA at SemEval-2022 Task 8: Lightweight and Language-Agnostic News Similarity Classifier
Nicolas Stefanovitch
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

We present our contribution to the SemEval-2022 Shared Task 8: Multilingual news article similarity. The approach is lightweight and language-agnostic: it is based on the computation of several lexicographic and embedding-based features, and the use of a simple ML approach, random forests. In a notable departure from the task formulation, which is a ranking task, we tackled this task as a classification one. We present a detailed analysis of the behaviour of our system under different settings.

Recovering Text from Endangered Languages Corrupted PDF documents
Nicolas Stefanovitch
Proceedings of the Fifth Workshop on the Use of Computational Methods in the Study of Endangered Languages

In this paper we present an approach to efficiently recover texts from corrupted documents of endangered languages. Textual resources for such languages are scarce, and sometimes the few available resources are corrupted PDF documents. Endangered languages are not supported by standard tools and present the additional difficulty of not possessing any corpus over which to train language models to assist with the recovery. The approach presented is able to fully recover born-digital PDF documents with minimal effort, thereby helping the preservation effort of endangered languages by extending the range of documents usable for corpus building.


Fine-grained Event Classification in News-like Text Snippets - Shared Task 2, CASE 2021
Jacek Haneczok | Guillaume Jacquet | Jakub Piskorski | Nicolas Stefanovitch
Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021)

This paper describes the Shared Task on Fine-grained Event Classification in News-like Text Snippets. The Shared Task is divided into three sub-tasks: (a) classification of text snippets reporting socio-political events (25 classes) for which a vast amount of training data exists, although exhibiting different structure and style vis-a-vis test data, (b) enhancement to a generalized zero-shot learning problem, where 3 additional event types were introduced in advance, but without any training data (‘unseen’ classes), and (c) further extension, which introduced 2 additional event types, announced shortly prior to the evaluation phase. The reported Shared Task focuses on classification of events in English texts and is organized as part of the Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021), co-located with the ACL-IJCNLP 2021 Conference. Four teams participated in the task. The best performing systems for the three aforementioned sub-tasks achieved 83.9%, 79.7% and 77.1% weighted F1 scores, respectively.

Discovering Black Lives Matter Events in the United States: Shared Task 3, CASE 2021
Salvatore Giorgi | Vanni Zavarella | Hristo Tanev | Nicolas Stefanovitch | Sy Hwang | Hansi Hettiarachchi | Tharindu Ranasinghe | Vivek Kalyan | Paul Tan | Shaun Tan | Martin Andrews | Tiancheng Hu | Niklas Stoehr | Francesco Ignazio Re | Daniel Vegh | Dennis Atzenhofer | Brenda Curtis | Ali Hürriyetoğlu
Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021)

State-of-the-art event detection systems are infrequently evaluated on how well they determine the spatio-temporal distribution of events on the ground. Yet the ability to both (1) extract events “in the wild” from text and (2) properly evaluate event detection systems has the potential to support a wide variety of tasks such as monitoring the activity of socio-political movements, examining media coverage and public support of these movements, and informing policy decisions. Therefore, we study the performance of the best event detection systems on detecting Black Lives Matter (BLM) events from tweets and news articles. The murder of George Floyd, an unarmed Black man, at the hands of police officers received global attention throughout the second half of 2020. Protests against police violence emerged worldwide and the BLM movement, which was once mostly relegated to the United States, was now seeing activity globally. This shared task asks participants to identify BLM-related events from large unstructured data sources, using systems pretrained to extract socio-political events from text. We evaluate several metrics, assessing each system’s ability to identify protest events both temporally and spatially. Results show that identifying daily protest counts is an easier task than classifying spatial and temporal protest trends simultaneously, with maximum performance of 0.745 and 0.210 (Pearson r), respectively. Additionally, all baselines and participant systems suffered from low recall, with a maximum recall of 5.08.

Exploring Linguistically-Lightweight Keyword Extraction Techniques for Indexing News Articles in a Multilingual Set-up
Jakub Piskorski | Nicolas Stefanovitch | Guillaume Jacquet | Aldo Podavini
Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation

This paper presents a study of state-of-the-art unsupervised and linguistically unsophisticated keyword extraction algorithms, based on statistic-, graph-, and embedding-based approaches, including, i.a., Total Keyword Frequency, TF-IDF, RAKE, KPMiner, YAKE, KeyBERT, and variants of TextRank-based keyword extraction algorithms. The study was motivated by the need to select the most appropriate technique to extract keywords for indexing news articles in a real-world large-scale news analysis engine. The algorithms were evaluated on a corpus of circa 330 news articles in 7 languages. The overall best F1 scores for all languages on average were obtained using a combination of the recently introduced YAKE algorithm and KPMiner (20.1%, 46.6% and 47.2% for exact, partial and fuzzy matching resp.).