Giuseppe Attardi

2025

Dataground at SemEval-2025 Task 8: Small LLMs and Preference Optimization for Tabular QA
Giuseppe Attardi | Andrea Nelson Mauro | Daniele Sartiano
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

We present our submission to SemEval 2025 Task 8: Question Answering on Tabular Data, which challenges participants to develop systems capable of answering natural language questions on real-world tabular datasets. Our approach aims at generating Pandas code that can be run on such datasets to produce the desired answer. The approach consists in fine-tuning a Small Language Model (SLM) through Preference Optimization on both positive and negative examples generated by a teacher model.A base SLM is first elicited to produce the code to compute the answer to a question through a Chain of Thought (CoT) prompt. We performed extensive testing on the DataBench development set, exploring a variety of prompts, eventually settling on a detailed instruction prompt, followed by two-shot examples. Due to hardware constraints, the base model was an SLM with ${leq}$ 8 billion parameters.We then fine-tuned the model through Odds Ratio Preference Optimization (ORPO) using as training data the code produced by a teacher model on the DataBench training set. The teacher model was GPT-4o, whose code was labeled preferred, while the code generated by the base model was rejected. This increased the accuracy on the development set from 71% to 85%.Our method demonstrated robust performance in answering complex questions across diverse datasets, highlighting the effectiveness of combining small LLMs with supervised fine-tuning and automated code execution for tabular question answering.

2021

pdf bib abs

Biaffine Dependency and Semantic Graph Parsing for EnhancedUniversal Dependencies
Giuseppe Attardi | Daniele Sartiano | Maria Simi
Proceedings of the 17th International Conference on Parsing Technologies and the IWPT 2021 Shared Task on Parsing into Enhanced Universal Dependencies (IWPT 2021)

This paper presents the system used in our submission to the IWPT 2021 Shared Task. This year the official evaluation metrics was ELAS, therefore dependency parsing might have been avoided as well as other pipeline stages like POS tagging and lemmatization. We nevertheless chose to deploy a combination of a dependency parser and a graph parser. The dependency parser is a biaffine parser, that uses transformers for representing input sentences, with no other feature. The graph parser is a semantic parser that exploits a similar architecture except for using a sigmoid crossentropy loss function to return multiple values for the predicted arcs. The final output is obtained by merging the output of the two parsers. The dependency parser achieves top or close to top LAS performance with respect to other systems that report results on such metrics, except on low resource languages (Tamil, Estonian, Latvian).

2020

pdf bib abs

Linear Neural Parsing and Hybrid Enhancement for Enhanced Universal Dependencies
Giuseppe Attardi | Daniele Sartiano | Maria Simi
Proceedings of the 16th International Conference on Parsing Technologies and the IWPT 2020 Shared Task on Parsing into Enhanced Universal Dependencies

To accomplish the shared task on dependency parsing we explore the use of a linear transition-based neural dependency parser as well as a combination of three of them by means of a linear tree combination algorithm. We train separate models for each language on the shared task data. We compare our base parser with two biaffine parsers and also present an ensemble combination of all five parsers, which achieves an average UAS 1.88 point lower than the top official submission. For producing the enhanced dependencies, we exploit a hybrid approach, coupling an algorithmic graph transformation of the dependency tree with predictions made by a multitask machine learning model.

pdf bib abs

Transfer Learning from Transformers to Fake News Challenge Stance Detection (FNC-1) Task
Valeriya Slovikovskaya | Giuseppe Attardi
Proceedings of the Twelfth Language Resources and Evaluation Conference

Transformer models, trained and publicly released over the last couple of years, have proved effective in many NLP tasks. We wished to test their usefulness in particular on the stance detection task. We performed experiments on the data from the Fake News Challenge Stage 1 (FNC-1). We were indeed able to improve the reported SotA on the challenge, by exploiting the generalization power of large language models based on Transformer architecture. Specifically (1) we improved the FNC-1 best performing model adding BERT sentence embedding of input sequences as a model feature, (2) we fine-tuned BERT, XLNet, and RoBERTa transformers on FNC-1 extended dataset and obtained state-of-the-art results on FNC-1 task.