Raffaele Manna

2024

DIMMI - Drug InforMation Mining in Italian: A CALAMITA Challenge
Raffaele Manna | Maria Pia Di Buono | Luca Giordano
Proceedings of the Tenth Italian Conference on Computational Linguistics (CLiC-it 2024)

Patients’ knowledge about drugs and medications is crucial as it allows them to administer them safely. This knowledge frequently comes from written prescriptions, patient information leaflets (PILs), or from reading drug Web pages. DIMMI (Drug InforMation Mining in Italian) is a challenge aiming at evaluating the proficiency of Large Language Models in extracting drug-specific information from PILs. The challenge seeks to advance the understanding of effectiveness in processing complex medical information in Italian, and to enhance drug information extraction and pharmacovigilance efforts. Participants are provided with a dataset of 600 Italian PILs and the objective is to develop models capable of accurately answering specific questions related to drug dosage, usage, side effects, drug-drug interactions. The challenge should be approached as an information extraction task through a zero-shot mode, purely based on the model pre-existing knowledge and understanding or through in-context learning (Retrieval-Augmented Generation (RAG) or few-shot mode). The answers generated by the models will be compared against the gold standard (GS), created to establish a reliable, accurate, and a comprehensive set of answers against which participant submissions can be evaluated. For each drug and each information category, the GS contains the correct information extracted from the leaflets through a manual annotation.

Riddle Me This: Evaluating Large Language Models in Solving Word-Based Games
Raffaele Manna | Maria Pia di Buono | Johanna Monti
Proceedings of the 10th Workshop on Games and Natural Language Processing @ LREC-COLING 2024

In this contribution, we examine the proficiency of Large Language Models (LLMs) in solving the linguistic game “La Ghigliottina,” the final game of the popular Italian TV quiz show “L’Eredità”. This game is particularly challenging as it requires LLMs to engage in semantic inference reasoning for identifying the solutions of the game. Our experiment draws inspiration from Ghigliottin-AI, a task of EVALITA 2020, an evaluation campaign focusing on Natural Language Processing (NLP) and speech tools designed for the Italian language. To benchmark our experiment, we use the results of the most successful artificial player in this task, namely Il Mago della Ghigliottina. The paper describes the experimental setting and the results which show that LLMs perform poorly.

2022

Assessing the Quality of an Italian Crowdsourced Idiom Corpus: the Dodiom Experiment
Giuseppina Morza | Raffaele Manna | Johanna Monti
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This paper describes how idiom-related language resources, collected through a crowdsourcing experiment carried out by means of Dodiom, a Game-with-a-purpose, have been analysed by language experts. The paper focuses on the criteria adopted for the data annotation and evaluation process. The main scope of this project is, indeed, the evaluation of the quality of the linguistic data obtained through a crowdsourcing project, namely to assess if the data provided and evaluated by the players who joined the game are actually considered of good quality by the language experts. Finally, results of the annotation and evaluation processes as well as future work are presented.

Proceedings of the Second International Workshop on Resources and Techniques for User Information in Abusive Language Analysis
Johanna Monti | Valerio Basile | Maria Pia Di Buono | Raffaele Manna | Antonio Pascucci | Sara Tonelli
Proceedings of the Second International Workshop on Resources and Techniques for User Information in Abusive Language Analysis

2020

The Archaeo-Term Project: Multilingual Terminology in Archaeology
Giulia Speranza | Raffaele Manna | Maria Pia Di Buono | Johanna Monti
Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020)

The Role of Computational Stylometry in Identifying (Misogynistic) Aggression in English Social Media Texts
Antonio Pascucci | Raffaele Manna | Vincenzo Masucci | Johanna Monti
Proceedings of the Second Workshop on Trolling, Aggression and Cyberbullying

In this paper, we describe UniOr_ExpSys team participation in TRAC-2 (Trolling, Aggression and Cyberbullying) shared task, a workshop organized as part of LREC 2020. TRAC-2 shared task is organized in two sub-tasks: Aggression Identification (a 3-way classification between “Overtly Aggressive”, “Covertly Aggressive” and “Non-aggressive” text data) and Misogynistic Aggression Identification (a binary classifier for classifying the texts as “gendered” or “non-gendered”). Our approach is based on linguistic rules, stylistic features extraction through stylometric analysis and Sequential Minimal Optimization algorithm in building the two classifiers.

Monitoring Social Media to Identify Environmental Crimes through NLP. A preliminary study
Raffaele Manna | Antonio Pascucci | Wanda Punzi Zarino | Vincenzo Simoniello | Johanna Monti
Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020)

UNIOR NLP at MWSA Task - GlobaLex 2020: Siamese LSTM with Attention for Word Sense Alignment
Raffaele Manna | Giulia Speranza | Maria Pia di Buono | Johanna Monti
Proceedings of the 2020 Globalex Workshop on Linked Lexicography

In this paper we describe the system submitted to the ELEXIS Monolingual Word Sense Alignment Task. We test different systems, which are two types of LSTMs and a system based on a pretrained Bidirectional Encoder Representations from Transformers (BERT) model, to solve the task. LSTM models use fastText pre-trained word vectors features with different settings. For training the models, we did not combine external data with the dataset provided for the task. We select a sub-set of languages among the proposed ones, namely a set of Romance languages, i.e., Italian, Spanish, Portuguese, together with English and Dutch. The Siamese LSTM with attention and PoS tagging (LSTM-A) performed better than the other two systems, achieving a 5-Class Accuracy score of 0.844 in the Overall Results, ranking the first position among five teams.

Is this hotel review truthful or deceptive? A platform for disinformation detection through computational stylometry
Antonio Pascucci | Raffaele Manna | Ciro Caterino | Vincenzo Masucci | Johanna Monti
Proceedings for the First International Workshop on Social Threats in Online Conversations: Understanding and Management

In this paper, we present a web service platform for disinformation detection in hotel reviews written in English. The platform relies on a hybrid approach of computational stylometry techniques, machine learning and linguistic rules written using COGITO, Expert System Corp.’s semantic intelligence software thanks to which it is possible to analyze texts and extract all their characteristics. We carried out a research experiment on the Deceptive Opinion Spam corpus, a balanced corpus composed of 1,600 hotel reviews of 20 Chicago hotels split into four datasets: positive truthful, negative truthful, positive deceptive and negative deceptive reviews. We investigated four different classifiers and we found that Simple Logistic is the best-performing algorithm for this type of classification.

Proceedings of the Workshop on Resources and Techniques for User and Author Profiling in Abusive Language
Johanna Monti | Valerio Basile | Maria Pia Di Buono | Raffaele Manna | Antonio Pascucci | Sara Tonelli
Proceedings of the Workshop on Resources and Techniques for User and Author Profiling in Abusive Language
