Federico Ruggeri


2024

pdf bib
Multimodal Fallacy Classification in Political Debates
Eleonora Mancini | Federico Ruggeri | Paolo Torroni
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers)

Recent advances in NLP suggest that some tasks, such as argument detection and relation classification, are better framed in a multimodal perspective. We propose multimodal argument mining for argumentative fallacy classification in political debates. To this end, we release the first corpus for multimodal fallacy classification. Our experiments show that the integration of the audio modality leads to superior classification performance. Our findings confirm that framing fallacy classification as a multimodal task is essential to capture paralinguistic aspects of fallacious arguments.

2023

pdf bib
A Dataset of Argumentative Dialogues on Scientific Papers
Federico Ruggeri | Mohsen Mesgar | Iryna Gurevych
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

With recent advances in question-answering models, various datasets have been collected to improve and study the effectiveness of these models on scientific texts. Questions and answers in these datasets explore a scientific paper by seeking factual information from the paper’s content. However, these datasets do not tackle the argumentative content of scientific papers, which is of huge importance in persuasiveness of a scientific discussion. We introduce ArgSciChat, a dataset of 41 argumentative dialogues between scientists on 20 NLP papers. The unique property of our dataset is that it includes both exploratory and argumentative questions and answers in a dialogue discourse on a scientific paper. Moreover, the size of ArgSciChat demonstrates the difficulties in collecting dialogues for specialized domains. Thus, our dataset is a challenging resource to evaluate dialogue agents in low-resource domains, in which collecting training data is costly. We annotate all sentences of dialogues in ArgSciChat and analyze them extensively. The results confirm that dialogues in ArgSciChat include exploratory and argumentative interactions. Furthermore, we use our dataset to fine-tune and evaluate a pre-trained document-grounded dialogue agent. The agent achieves a low performance on our dataset, motivating a need for dialogue agents with a capability to reason and argue about their answers. We publicly release ArgSciChat.

2022

pdf bib
Detecting Arguments in CJEU Decisions on Fiscal State Aid
Giulia Grundler | Piera Santin | Andrea Galassi | Federico Galli | Francesco Godano | Francesca Lagioia | Elena Palmieri | Federico Ruggeri | Giovanni Sartor | Paolo Torroni
Proceedings of the 9th Workshop on Argument Mining

The successful application of argument mining in the legal domain can dramatically impact many disciplines related to law. For this purpose, we present Demosthenes, a novel corpus for argument mining in legal documents, composed of 40 decisions of the Court of Justice of the European Union on matters of fiscal state aid. The annotation specifies three hierarchical levels of information: the argumentative elements, their types, and their argument schemes. In our experimental evaluation, we address 4 different classification tasks, combining advanced language models and traditional classifiers.

pdf bib
Multimodal Argument Mining: A Case Study in Political Debates
Eleonora Mancini | Federico Ruggeri | Andrea Galassi | Paolo Torroni
Proceedings of the 9th Workshop on Argument Mining

We propose a study on multimodal argument mining in the domain of political debates. We collate and extend existing corpora and provide an initial empirical study on multimodal architectures, with a special emphasis on input encoding methods. Our results provide interesting indications about future directions in this important domain.

pdf bib
Combining WordNet and Word Embeddings in Data Augmentation for Legal Texts
Sezen Perçin | Andrea Galassi | Francesca Lagioia | Federico Ruggeri | Piera Santin | Giovanni Sartor | Paolo Torroni
Proceedings of the Natural Legal Language Processing Workshop 2022

Creating balanced labeled textual corpora for complex tasks, like legal analysis, is a challenging and expensive process that often requires the collaboration of domain experts. To address this problem, we propose a data augmentation method based on the combination of GloVe word embeddings and the WordNet ontology. We present an example of application in the legal domain, specifically on decisions of the Court of Justice of the European Union.Our evaluation with human experts confirms that our method is more robust than the alternatives.

pdf bib
A Sentiment and Emotion Annotated Dataset for Bitcoin Price Forecasting Based on Reddit Posts
Pavlo Seroyizhko | Zhanel Zhexenova | Muhammad Zohaib Shafiq | Fabio Merizzi | Andrea Galassi | Federico Ruggeri
Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP)

Cryptocurrencies have gained enormous momentum in finance and are nowadays commonly adopted as a medium of exchange for online payments. After recent events during which GameStop’s stocks were believed to be influenced by WallStreetBets subReddit, Reddit has become a very hot topic on the cryptocurrency market. The influence of public opinions on cryptocurrency price trends has inspired researchers on exploring solutions that integrate such information in crypto price change forecasting. A popular integration technique regards representing social media opinions via sentiment features. However, this research direction is still in its infancy, where a limited number of publicly available datasets with sentiment annotations exists. We propose a novel Bitcoin Reddit Sentiment Dataset, a ready-to-use dataset annotated with state-of-the-art sentiment and emotion recognition. The dataset contains pre-processed Reddit posts and comments about Bitcoin from several domain-related subReddits along with Bitcoin’s financial data. We evaluate several widely adopted neural architectures for crypto price change forecasting. Our results show controversial benefits of sentiment and emotion features advocating for more sophisticated social media integration techniques. We make our dataset publicly available for research.