Étienne Simon


2024

Socio-political Events of Conflict and Unrest: A Survey of Available Datasets
Helene Olsen | Étienne Simon | Erik Velldal | Lilja Øvrelid
Proceedings of the 7th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2024)

There is a large and growing body of literature on datasets created to facilitate the study of socio-political events of conflict and unrest. However, the datasets, and the approaches taken to create them, vary a lot depending on the type of research they are intended to support. For example, while scholars from natural language processing (NLP) tend to focus on annotating specific spans of text indicating various components of an event, scholars from the disciplines of political science and conflict studies tend to focus on creating databases that code an abstract but structured representation of the event, less tied to a specific source text. The survey presented in this paper aims to map out the current landscape of available event datasets within the domain of social and political conflict and unrest – both from the NLP and political science communities – offering a unified view of the work done across different disciplines.

2022

Fine-tuning and Sampling Strategies for Multimodal Role Labeling of Entities under Class Imbalance
Syrielle Montariol | Étienne Simon | Arij Riabi | Djamé Seddah
Proceedings of the Workshop on Combating Online Hostile Posts in Regional Languages during Emergency Situations

We propose our solution to the multimodal semantic role labeling task from the CONSTRAINT’22 workshop. The task aims at classifying entities in memes into classes such as “hero” and “villain”. We use several pre-trained multimodal models to jointly encode the text and image of the memes, and implement three systems to classify the role of the entities. We propose dynamic sampling strategies to tackle the issue of class imbalance. Finally, we perform qualitative analysis on the representations of the entities.

2019

Unsupervised Information Extraction: Regularizing Discriminative Approaches with Relation Distribution Losses
Étienne Simon | Vincent Guigue | Benjamin Piwowarski
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Unsupervised relation extraction aims at extracting relations between entities in text. Previous unsupervised approaches are either generative or discriminative. In a supervised setting, discriminative approaches, such as deep neural network classifiers, have demonstrated substantial improvement. However, these models are hard to train without supervision, and the currently proposed solutions are unstable. To overcome this limitation, we introduce a skewness loss which encourages the classifier to predict a relation with confidence given a sentence, and a distribution distance loss enforcing that all relations are predicted on average. These losses improve the performance of discriminative-based models, and enable us to train deep neural networks satisfactorily, surpassing the current state of the art on three different datasets.