Joan Plepi


2023

Personalized Intended and Perceived Sarcasm Detection on Twitter
Joan Plepi | Magdalena Buski | Lucie Flek
Proceedings of the 3rd Workshop on Computational Linguistics for the Political and Social Sciences

2022

Understanding Interpersonal Conflict Types and their Impact on Perception Classification
Charles Welch | Joan Plepi | Béla Neuendorf | Lucie Flek
Proceedings of the Fifth Workshop on Natural Language Processing and Computational Social Science (NLP+CSS)

Studies on interpersonal conflict have a long history and contain many suggestions for conflict typology. We use this as the basis of a novel annotation scheme and release a new dataset of situations and conflict aspect annotations. We then build a classifier to predict whether someone will perceive the actions of one individual as right or wrong in a given situation. Our analyses include conflict aspects, but also generated clusters, which are human validated, and show differences in conflict content based on the relationship of participants to the author. Our findings have important implications for understanding conflict and social norms.

FACTOID: A New Dataset for Identifying Misinformation Spreaders and Political Bias
Flora Sakketou | Joan Plepi | Riccardo Cervero | Henri Jacques Geiss | Paolo Rosso | Lucie Flek
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Proactively identifying misinformation spreaders is an important step towards mitigating the impact of fake news on our society. In this paper, we introduce a new contemporary Reddit dataset for fake news spreader analysis, called FACTOID, monitoring political discussions on Reddit since the beginning of 2020. The dataset contains over 4K users with 3.4M Reddit posts, and includes, beyond the users’ binary labels, also their fine-grained credibility level (very low to very high) and their political bias strength (extreme right to extreme left). As far as we are aware, this is the first fake news spreader dataset that simultaneously captures both the long-term context of users’ historical posts and the interactions between them. To create the first benchmark on our data, we provide methods for identifying misinformation spreaders by utilizing the social connections between the users along with their psycho-linguistic features. We show that the users’ social interactions can, on their own, indicate misinformation spreading, while the psycho-linguistic features are mostly informative in non-neural classification settings. In a qualitative analysis we observe that detecting affective mental processes correlates negatively with right-biased users, and that the openness to experience factor is lower for those who spread fake news.

Temporal Graph Analysis of Misinformation Spreaders in Social Media
Joan Plepi | Flora Sakketou | Henri-Jacques Geiss | Lucie Flek
Proceedings of TextGraphs-16: Graph-based Methods for Natural Language Processing

Proactively identifying misinformation spreaders is an important step towards mitigating the impact of fake news on our society. Although the news domain is subject to rapid changes over time, the temporal dynamics of the spreaders’ language and network have not been explored yet. In this paper, we analyze the users’ time-evolving semantic similarities and social interactions and show that such patterns can, on their own, indicate misinformation spreading. Building on these observations, we propose a dynamic graph-based framework that leverages the dynamic nature of the users’ network for detecting fake news spreaders. We validate our design choice through qualitative analysis and demonstrate the contributions of our model’s components through a series of exploratory and ablative experiments on two datasets.

Unifying Data Perspectivism and Personalization: An Application to Social Norms
Joan Plepi | Béla Neuendorf | Lucie Flek | Charles Welch
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Instead of using a single ground truth for language processing tasks, several recent studies have examined how to represent and predict the labels of the set of annotators. However, often little or no information about annotators is known, or the set of annotators is small. In this work, we examine a corpus of social media posts about conflict from a set of 13k annotators and 210k judgements of social norms. We provide a novel experimental setup that applies personalization methods to the modeling of annotators and compare their effectiveness for predicting the perception of social norms. We further provide an analysis of performance across subsets of social situations that vary by the closeness of the relationship between parties in conflict, and assess where personalization helps the most.

2021

Conversational Question Answering over Knowledge Graphs with Transformer and Graph Attention Networks
Endri Kacupaj | Joan Plepi | Kuldeep Singh | Harsh Thakkar | Jens Lehmann | Maria Maleshkova
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

This paper addresses the task of (complex) conversational question answering over a knowledge graph. For this task, we propose LASAGNE (muLti-task semAntic parSing with trAnsformer and Graph atteNtion nEtworks). It is the first approach that employs a transformer architecture extended with Graph Attention Networks for multi-task neural semantic parsing. LASAGNE uses a transformer model for generating the base logical forms, while the Graph Attention model is used to exploit correlations between (entity) types and predicates to produce node representations. LASAGNE also includes a novel entity recognition module which detects, links, and ranks all relevant entities in the question context. We evaluate LASAGNE on a standard dataset for complex sequential question answering, on which it outperforms existing baselines when averaged over all question types. Specifically, we show that LASAGNE improves the F1-score on eight out of ten question types; in some cases, the increase is more than 20% compared to the state of the art (SotA).

Perceived and Intended Sarcasm Detection with Graph Attention Networks
Joan Plepi | Lucie Flek
Findings of the Association for Computational Linguistics: EMNLP 2021

Existing sarcasm detection systems focus on exploiting linguistic markers, context, or user-level priors. However, social studies suggest that the relationship between the author and the audience can be equally relevant for sarcasm usage and interpretation. In this work, we propose a framework jointly leveraging (1) a user context from their historical tweets together with (2) the social information from a user’s neighborhood in an interaction graph, to contextualize the interpretation of the post. We distinguish between perceived and self-reported sarcasm identification. We use graph attention networks (GAT) over users and tweets in a conversation thread, combined with various dense user history representations. Apart from achieving state-of-the-art results on the recently published dataset of 19k Twitter users with 30k labeled tweets, adding 10M unlabeled tweets as context, our experiments indicate that the graph network contributes to interpreting the sarcastic intentions of the author more than to predicting the sarcasm perception by others.

Perceived and Intended Sarcasm Detection with Graph Attention Networks
Joan Plepi | Lucie Flek
Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021)

Existing sarcasm detection systems focus on exploiting linguistic markers, context, or user-level priors. However, social studies suggest that the relationship between the author and the audience can be equally relevant for sarcasm usage and interpretation. In this work, we propose a framework jointly leveraging (1) a user context from their historical tweets together with (2) the social information from a user’s conversational neighborhood in an interaction graph, to contextualize the interpretation of the post. We use graph attention networks (GAT) over users and tweets in a conversation thread, combined with dense user history representations. Apart from achieving state-of-the-art results on the recently published dataset of 19k Twitter users with 30k labeled tweets, adding 10M unlabeled tweets as context, our results indicate that the model contributes to interpreting the sarcastic intentions of an author more than to predicting the sarcasm perception by others.