Gabriele Picco


2023

pdf bib
Matching Pairs: Attributing Fine-Tuned Models to their Pre-Trained Large Language Models
Myles Foley | Ambrish Rawat | Taesung Lee | Yufang Hou | Gabriele Picco | Giulio Zizzo
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The wide applicability and adaptability of generative large language models (LLMs) has enabled their rapid adoption. While the pre-trained models can perform many tasks, such models are often fine-tuned to improve their performance on various downstream applications. However, this leads to issues over violation of model licenses, model theft, and copyright infringement. Moreover, recent advances show that generative technology is capable of producing harmful content which exacerbates the problems of accountability within model supply chains. Thus, we need a method to investigate how a model was trained or a piece of text was generated and what their pre-trained base model was. In this paper we take the first step to address this open problem by tracing back the origin of a given fine-tuned LLM to its corresponding pre-trained base model. We consider different knowledge levels and attribution strategies, and find that we can correctly trace back 8 out of the 10 fine tuned models with our best method.

pdf bib
Zshot: An Open-source Framework for Zero-Shot Named Entity Recognition and Relation Extraction
Gabriele Picco | Marcos Martinez Galindo | Alberto Purpura | Leopold Fuchs | Vanessa Lopez | Thanh Lam Hoang
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)

The Zero-Shot Learning (ZSL) task pertains to the identification of entities or relations in texts that were not seen during training. ZSL has emerged as a critical research area due to the scarcity of labeled data in specific domains, and its applications have grown significantly in recent years. With the advent of large pretrained language models, several novel methods have been proposed, resulting in substantial improvements in ZSL performance. There is a growing demand, both in the research community and industry, for a comprehensive ZSL framework that facilitates the development and accessibility of the latest methods and pretrained models. In this study, we propose a novel ZSL framework called Zshot that aims to address the aforementioned challenges. Our primary objective is to provide a platform that allows researchers to compare different state-of-the-art ZSL methods with standard benchmark datasets. Additionally, we have designed our framework to support the industry with readily available APIs for production under the standard SpaCy NLP pipeline. Our API is extendible and evaluable, moreover, we include numerous enhancements such as boosting the accuracy with pipeline ensembling and visualization utilities available as a SpaCy extension.

2021

pdf bib
Towards Protecting Vital Healthcare Programs by Extracting Actionable Knowledge from Policy
Vanessa Lopez | Nagesh Yadav | Gabriele Picco | Inge Vejsbjerg | Eoin Carrol | Seamus Brady | Marco Luca Sbodio | Lam Thanh Hoang | Miao Wei | John Segrave
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
Neural Unification for Logic Reasoning over Natural Language
Gabriele Picco | Thanh Lam Hoang | Marco Luca Sbodio | Vanessa Lopez
Findings of the Association for Computational Linguistics: EMNLP 2021

Automated Theorem Proving (ATP) deals with the development of computer programs being able to show that some conjectures (queries) are a logical consequence of a set of axioms (facts and rules). There exists several successful ATPs where conjectures and axioms are formally provided (e.g. formalised as First Order Logic formulas). Recent approaches, such as Clark et al., have proposed transformer-based architectures for deriving conjectures given axioms expressed in natural language (English). The conjecture is verified through a binary text classifier, where the transformers model is trained to predict the truth value of a conjecture given the axioms. The RuleTaker approach of Clark et al. achieves appealing results both in terms of accuracy and in the ability to generalize, showing that when the model is trained with deep enough queries (at least 3 inference steps), the transformers are able to correctly answer the majority of queries (97.6%) that require up to 5 inference steps. In this work we propose a new architecture, namely the Neural Unifier, and a relative training procedure, which achieves state-of-the-art results in term of generalisation, showing that mimicking a well-known inference procedure, the backward chaining, it is possible to answer deep queries even when the model is trained only on shallow ones. The approach is demonstrated in experiments using a diverse set of benchmark data and the source code is released to the research community for reproducibility.