2024
Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data
Parth Patwa | Simone Filice | Zhiyu Chen | Giuseppe Castellucci | Oleg Rokhlenko | Shervin Malmasi
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Large Language Models (LLMs) operating in 0-shot or few-shot settings achieve competitive results in Text Classification tasks. In-Context Learning (ICL) typically achieves better accuracy than the 0-shot setting, but it pays in terms of efficiency, due to the longer input prompt. In this paper, we propose a strategy to make LLMs as efficient as 0-shot text classifiers, while getting comparable or better accuracy than ICL. Our solution targets the low resource setting, i.e., when only 4 examples per class are available. Using a single LLM and few-shot real data we perform a sequence of generation, filtering and Parameter-Efficient Fine-Tuning steps to create a robust and efficient classifier. Experimental results show that our approach leads to competitive results on multiple text classification datasets.
2023
Faithful Low-Resource Data-to-Text Generation through Cycle Training
Zhuoer Wang | Marcus Collins | Nikhita Vedula | Simone Filice | Shervin Malmasi | Oleg Rokhlenko
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Methods to generate text from structured data have advanced significantly in recent years, primarily due to fine-tuning of pre-trained language models on large datasets. However, such models can fail to produce output faithful to the input data, particularly on out-of-domain data. Sufficient annotated data is often not available for specific domains, leading us to seek an unsupervised approach to improve the faithfulness of output text. Since the problem is fundamentally one of consistency between the representations of the structured data and text, we evaluate the effectiveness of cycle training in this work. Cycle training uses two models which are inverses of each other: one that generates text from structured data, and one that generates the structured data from natural language text. We show that cycle training, when initialized with a small amount of supervised data (100 samples in our case), achieves nearly the same performance as fully supervised approaches for the data-to-text generation task on the WebNLG, E2E, WTQ, and WSQL datasets. We perform extensive empirical analysis with automated evaluation metrics and a newly designed human evaluation schema to reveal how effectively different cycle training strategies reduce various types of generation errors. Our code is publicly available at https://github.com/Edillower/CycleNLG.
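The two-model loop described in this abstract can be illustrated with a toy sketch. This is not the authors' implementation (see their repository for that). It assumes that each direction of the cycle uses one model to produce an intermediate output and trains the other model to reconstruct the original input. `LookupModel`, `generate`, `train_step`, and `cycle_training_epoch` are hypothetical names, and the "model" is a memorizing dictionary standing in for a sequence-to-sequence network.

```python
class LookupModel:
    """Toy stand-in for a seq2seq model (hypothetical): memorizes source -> target pairs."""

    def __init__(self, seed_pairs=None):
        # seed_pairs plays the role of the small supervised initialization set.
        self.table = dict(seed_pairs or {})

    def generate(self, source):
        # Unknown inputs are echoed back unchanged; a real model would decode text.
        return self.table.get(source, source)

    def train_step(self, source, target):
        # A real model would take a gradient step on (source, target).
        self.table[source] = target


def cycle_training_epoch(data_to_text, text_to_data, unpaired_data, unpaired_text):
    """One epoch of the assumed cycle: data -> text -> data, then text -> data -> text."""
    # Data cycle: generate text from data, train the inverse model to recover the data.
    for record in unpaired_data:
        pseudo_text = data_to_text.generate(record)
        text_to_data.train_step(source=pseudo_text, target=record)
    # Text cycle: generate data from text, train the forward model to recover the text.
    for sentence in unpaired_text:
        pseudo_record = text_to_data.generate(sentence)
        data_to_text.train_step(source=pseudo_record, target=sentence)


# Usage with made-up examples: the inverse model starts empty and learns from the cycle.
d2t = LookupModel({"name=Aria": "Aria is a restaurant."})
t2d = LookupModel()
cycle_training_epoch(d2t, t2d, ["name=Aria"], ["Bob is a pub."])
recovered = t2d.generate("Aria is a restaurant.")
```

In this sketch neither loop needs paired data; only the seed pairs passed to `LookupModel` are supervised.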
2022
Learning to Generate Examples for Semantic Processing Tasks
Danilo Croce | Simone Filice | Giuseppe Castellucci | Roberto Basili
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Although recent Transformer-based architectures, such as BERT, have achieved impressive results in semantic processing tasks, their fine-tuning stage still requires large-scale training resources. Usually, Data Augmentation (DA) techniques can help to deal with low-resource settings. In Text Classification tasks, the objective of DA is the generation of well-formed sentences that i) represent the desired task category and ii) are novel with respect to existing sentences. In this paper, we propose a neural approach to automatically learn to generate new examples using a pre-trained sequence-to-sequence model. We first learn a task-oriented similarity function that we use to pair similar examples. Then, we use these example pairs to train a model to generate examples. Experiments in low-resource settings show that augmenting the training material with the proposed strategy systematically improves the results on text classification and natural language inference tasks by up to 10% accuracy, outperforming existing DA approaches.
2021
Learning to Solve NLP Tasks in an Incremental Number of Languages
Giuseppe Castellucci | Simone Filice | Danilo Croce | Roberto Basili
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
In real scenarios, a multilingual model trained to solve NLP tasks on a set of languages can be required to support new languages over time. Unfortunately, straightforward retraining on a dataset containing annotated examples for all the languages is both expensive and time-consuming, especially when the number of target languages grows. Moreover, the original annotated material may no longer be available due to storage or business constraints. Retraining only with the new language data will inevitably result in Catastrophic Forgetting of previously acquired knowledge. We propose a Continual Learning strategy that updates a model to support new languages over time, while maintaining consistent results on previously learned languages. We define a Teacher-Student framework where the existing model “teaches” a student model its knowledge about the languages it supports, while the student is also trained on a new language. We report an experimental evaluation in several tasks including Sentence Classification, Relational Learning and Sequence Labeling.
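As a rough illustration of the Teacher-Student idea in this abstract, the sketch below combines a generic knowledge-distillation term (the student matches the teacher's output distribution) with a standard cross-entropy term on labeled data from the new language. The exact loss used in the paper is not stated here, so this formulation, the function names, and the `alpha` and `temperature` parameters are all assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def student_loss(student_logits, teacher_logits=None, gold_index=None,
                 alpha=0.5, temperature=2.0):
    """Hypothetical per-example loss for the student model.

    teacher_logits: teacher outputs on an example from an already supported language.
    gold_index: gold class for an annotated example from the new language.
    alpha and temperature are illustrative hyperparameters, not values from the paper.
    """
    loss = 0.0
    if teacher_logits is not None:
        # Distillation term: keep the student close to the teacher's predictions.
        p_teacher = softmax(teacher_logits, temperature)
        p_student = softmax(student_logits, temperature)
        loss += alpha * kl_divergence(p_teacher, p_student)
    if gold_index is not None:
        # Supervised term: fit the annotated example from the new language.
        loss += (1.0 - alpha) * -math.log(softmax(student_logits)[gold_index])
    return loss


# Usage with made-up logits for a 3-class task.
old_language_loss = student_loss([1.0, 2.0, 3.0], teacher_logits=[1.0, 2.0, 3.0])
new_language_loss = student_loss([1.0, 2.0, 3.0], gold_index=2)
```

The distillation term needs only the teacher's predictions, not the original annotations, which matches the setting where the original annotated material is no longer available.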
VoiSeR: A New Benchmark for Voice-Based Search Refinement
Simone Filice | Giuseppe Castellucci | Marcus Collins | Eugene Agichtein | Oleg Rokhlenko
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Voice assistants, e.g., Alexa or Google Assistant, have dramatically improved in recent years. Supporting voice-based search, exploration, and refinement are fundamental tasks for voice assistants, and remain an open challenge. For example, when using voice to search an online shopping site, a user often needs to refine their search by some aspect or facet. This common user intent is usually available through a “filter-by” interface on online shopping websites, but is challenging to support naturally via voice, as the intent of refinements must be interpreted in the context of the original search, the initial results, and the available product catalogue facets. To our knowledge, no benchmark dataset exists for training or validating such contextual search understanding models. To bridge this gap, we introduce the first large-scale dataset of voice-based search refinements, VoiSeR, consisting of about 10,000 search refinement utterances, collected using a novel crowdsourcing task. These utterances are intended to refine a previous search, with respect to a search facet or attribute (e.g., brand, color, review rating, etc.), and are manually annotated with the specific intent. This paper reports qualitative and empirical insights into the most common and challenging types of refinements that a voice-based conversational search system must support. As we show, VoiSeR can support research in conversational query understanding, contextual user intent prediction, and other conversational search topics to facilitate the development of conversational search systems.
2020
Proceedings of the First International Workshop on Natural Language Processing Beyond Text
Giuseppe Castellucci | Simone Filice | Soujanya Poria | Erik Cambria | Lucia Specia
Proceedings of the First International Workshop on Natural Language Processing Beyond Text
2017
Deep Learning in Semantic Kernel Spaces
Danilo Croce | Simone Filice | Giuseppe Castellucci | Roberto Basili
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Kernel methods enable the direct usage of structured representations of textual data during language learning and inference tasks. Expressive kernels, such as Tree Kernels, achieve excellent performance in NLP. On the other hand, deep neural networks have proven effective in automatically learning feature representations during training. However, their input is tensor data, i.e., they cannot manage rich structured information. In this paper, we show that expressive kernels and deep neural networks can be combined in a common framework in order to (i) explicitly model structured information and (ii) learn non-linear decision functions. We show that the input layer of a deep architecture can be pre-trained through the application of the Nyström low-rank approximation of kernel spaces. The resulting “kernelized” neural network achieves state-of-the-art accuracy in three different tasks.
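The Nyström step mentioned in this abstract can be sketched as follows: given a set of landmark examples with kernel matrix W, an input x is mapped to a dense vector whose inner products reproduce k(x)ᵀ W⁻¹ k(y), where k(x) holds the kernel values between x and the landmarks. The sketch makes two simplifying assumptions that are not from the paper: an RBF kernel over numeric vectors stands in for a structured kernel such as a Tree Kernel, and W⁻¹ is factorized with a Cholesky decomposition rather than the eigendecomposition usually shown. `NystromEmbedder` and the helper names are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Stand-in kernel (assumption); a tree kernel would be plugged in here instead."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def cholesky(w):
    """Lower-triangular C with C C^T = w, for a symmetric positive-definite matrix."""
    n = len(w)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(c[i][k] * c[j][k] for k in range(j))
            if i == j:
                c[i][j] = math.sqrt(w[i][i] - s)
            else:
                c[i][j] = (w[i][j] - s) / c[j][j]
    return c


def forward_substitute(c, b):
    """Solve C phi = b for lower-triangular C."""
    phi = []
    for i in range(len(b)):
        s = sum(c[i][k] * phi[k] for k in range(i))
        phi.append((b[i] - s) / c[i][i])
    return phi


class NystromEmbedder:
    """Maps inputs to len(landmarks)-dimensional vectors usable as network input."""

    def __init__(self, landmarks, kernel=rbf_kernel):
        self.landmarks = landmarks
        self.kernel = kernel
        gram = [[kernel(a, b) for b in landmarks] for a in landmarks]
        self.chol = cholesky(gram)

    def embed(self, x):
        # phi(x) = C^{-1} k(x), so phi(x) . phi(y) = k(x)^T W^{-1} k(y).
        k_x = [self.kernel(x, landmark) for landmark in self.landmarks]
        return forward_substitute(self.chol, k_x)


# Usage with made-up 2-D points as landmarks.
embedder = NystromEmbedder([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)])
vec_a = embedder.embed((0.0, 0.0))
vec_b = embedder.embed((1.0, 0.0))
approx = sum(p * q for p, q in zip(vec_a, vec_b))
```

For inputs that are themselves landmarks, the inner product of the embeddings equals the kernel value up to floating-point error; for other inputs it is a low-rank approximation.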
KeLP at SemEval-2017 Task 3: Learning Pairwise Patterns in Community Question Answering
Simone Filice | Giovanni Da San Martino | Alessandro Moschitti
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)
This paper describes the KeLP system participating in the SemEval-2017 community Question Answering (cQA) task. The system is a refinement of the kernel-based sentence pair modeling we proposed for the previous year's challenge. It is implemented within the Kernel-based Learning Platform called KeLP, from which we inherit the team’s name. Our primary submission ranked first in subtask A, and third in subtasks B and C, being the only system appearing in the top-3 ranking for all the English subtasks. This shows that the proposed framework, which has minor variations among the three subtasks, is extremely flexible and effective in tackling learning tasks defined on sentence pairs.
2016
Learning to Recognize Ancillary Information for Automatic Paraphrase Identification
Simone Filice | Alessandro Moschitti
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
KeLP at SemEval-2016 Task 3: Learning Semantic Relations between Questions and Answers
Simone Filice | Danilo Croce | Alessandro Moschitti | Roberto Basili
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)
2015
Global Thread-level Inference for Comment Classification in Community Question Answering
Shafiq Joty | Alberto Barrón-Cedeño | Giovanni Da San Martino | Simone Filice | Lluís Màrquez | Alessandro Moschitti | Preslav Nakov
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
Structural Representations for Learning Relations between Pairs of Texts
Simone Filice | Giovanni Da San Martino | Alessandro Moschitti
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Thread-Level Information for Comment Classification in Community Question Answering
Alberto Barrón-Cedeño | Simone Filice | Giovanni Da San Martino | Shafiq Joty | Lluís Màrquez | Preslav Nakov | Alessandro Moschitti
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
KeLP: a Kernel-based Learning Platform for Natural Language Processing
Simone Filice | Giuseppe Castellucci | Danilo Croce | Roberto Basili
Proceedings of ACL-IJCNLP 2015 System Demonstrations
QCRI: Answer Selection for Community Question Answering - Experiments for Arabic and English
Massimo Nicosia | Simone Filice | Alberto Barrón-Cedeño | Iman Saleh | Hamdy Mubarak | Wei Gao | Preslav Nakov | Giovanni Da San Martino | Alessandro Moschitti | Kareem Darwish | Lluís Màrquez | Shafiq Joty | Walid Magdy
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)
2014
UNITOR: Aspect Based Sentiment Analysis with Structured Learning
Giuseppe Castellucci | Simone Filice | Danilo Croce | Roberto Basili
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)
2013
UNITOR: Combining Syntactic and Semantic Kernels for Twitter Sentiment Analysis
Giuseppe Castellucci | Simone Filice | Danilo Croce | Roberto Basili
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)