Aldo Gangemi

2025

GRAMMAR-LLM: Grammar-Constrained Natural Language Generation
Gabriele Tuccio | Luana Bulla | Maria Madonia | Aldo Gangemi | Misael Mongiovì
Findings of the Association for Computational Linguistics: ACL 2025

Large Language Models have achieved impressive performance across various natural language generation tasks. However, their lack of a reliable control mechanism limits their effectiveness in applications that require strict adherence to predefined taxonomies, syntactic structures, or domain-specific rules. Existing approaches, such as fine-tuning and prompting, remain insufficient to ensure compliance with these requirements, particularly in low-resource scenarios and structured text generation tasks. To address these limitations, we introduce GRAMMAR-LLM, a novel framework that integrates formal grammatical constraints into the LLM decoding process. GRAMMAR-LLM enforces syntactic correctness in linear time while maintaining expressiveness in grammar rule definition. To achieve this, we define a class of grammars, called LL(prefix), which we show to be equivalent to LL(1), specifically designed for use with LLMs. These grammars are expressive enough to support common tasks such as hierarchical classification, vocabulary restriction, and structured parsing. We formally prove that LL(prefix) grammars can be transformed into LL(1) grammars in linear time, ensuring efficient processing via deterministic pushdown automata. We evaluate GRAMMAR-LLM across diverse NLP tasks, including hierarchical classification, sign language translation, and semantic parsing. Our experiments, conducted on models such as LLaMA 3 (for classification and translation) and AMRBART (for parsing), demonstrate that GRAMMAR-LLM consistently improves task performance across zero-shot, few-shot, and fine-tuned settings.

2023

Towards Distribution-shift Robust Text Classification of Emotional Content
Luana Bulla | Aldo Gangemi | Misael Mongiovì
Findings of the Association for Computational Linguistics: ACL 2023

Supervised models based on Transformers have been shown to achieve impressive performance in many natural language processing tasks. However, besides requiring a large amount of costly manually annotated data, supervised models tend to adapt to the characteristics of the training datasets, which are usually created ad hoc and whose data distribution often differs from the one in real applications, and therefore show significant performance degradation in real-world scenarios. We perform an extensive assessment of the out-of-distribution performance of supervised models for classification in the emotion and hate-speech detection tasks and show that NLI-based zero-shot models often outperform them, making task-specific annotation useless when the characteristics of final-user data are not known in advance. To benefit from both supervised and zero-shot approaches, we propose to fine-tune an NLI-based model on the task-specific dataset. The resulting model often outperforms all available supervised models both in distribution and out of distribution, with only a few thousand training samples.

2022

Uncovering Values: Detecting Latent Moral Content from Natural Language with Explainable and Non-Trained Methods
Luigi Asprino | Luana Bulla | Stefano De Giorgis | Aldo Gangemi | Ludovica Marinucci | Misael Mongiovì
Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures

Moral values as commonsense norms shape our everyday individual and community behavior. The possibility of rapidly extracting moral attitudes from natural language is an appealing prospect that would enable a deeper understanding of social interaction dynamics and of the individual cognitive and behavioral dimension. In this work we focus on detecting moral content from natural language, and we test our methods on a corpus of tweets previously labeled as containing moral values or violations, according to Moral Foundation Theory. We develop and compare two different approaches: (i) a frame-based symbolic value detector based on knowledge graphs and (ii) a zero-shot machine learning model fine-tuned on a task of Natural Language Inference (NLI) and a task of emotion detection. The final outcome of our work consists of two approaches meant to work without any prior training on a moral value detection task.

2016

Proceedings of the 2nd International Workshop on Natural Language Generation and the Semantic Web (WebNLG 2016)
Claire Gardent | Aldo Gangemi
Proceedings of the 2nd International Workshop on Natural Language Generation and the Semantic Web (WebNLG 2016)

2010

Senso Comune
Alessandro Oltramari | Guido Vetere | Maurizio Lenzerini | Aldo Gangemi | Nicola Guarino
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper introduces the general features of Senso Comune, an open knowledge base for the Italian language, focusing on the interplay of lexical and ontological knowledge, and outlining our approach to conceptual knowledge elicitation. Senso Comune consists of a machine-readable lexicon constrained by an ontological infrastructure. The idea at the basis of Senso Comune is that natural languages exist in use, and they belong to their users. In line with Saussure's linguistics, natural languages are seen as a social product, and their main strength relies on the users' consensus. At the same time, language has specific goals, i.e., referring to entities that belong to the users' world (be it physical or not) and that are made up in social environments where expressions are produced and understood. This usage leverages the creativity of those who produce words and try to understand them. This is the reason why ontology, i.e., a shared conceptualization of the world, can be regarded as the soil in which the speakers' consensus may be rooted. Some final remarks concerning future work and applications are also given.

2008

LMM: an OWL-DL MetaModel to Represent Heterogeneous Lexical Knowledge
Davide Picca | Alfio Massimiliano Gliozzo | Aldo Gangemi
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In this paper we present a Linguistic Meta-Model (LMM) allowing a semiotic-cognitive representation of knowledge. LMM is freely available and integrates the schemata of linguistic knowledge resources, such as WordNet and FrameNet, as well as foundational ontologies, such as DOLCE and its extensions. In addition, LMM is able to deal with multilinguality and to represent individuals and facts in an open domain perspective.

2006

Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Nicoletta Calzolari | Khalid Choukri | Aldo Gangemi | Bente Maegaard | Joseph Mariani | Jan Odijk | Daniel Tapias
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

Conversion of WordNet to a standard RDF/OWL representation
Mark van Assem | Aldo Gangemi | Guus Schreiber
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper presents an overview of the work in progress at the W3C to produce a conversion of WordNet to the RDF/OWL representation language in use in the Semantic Web community. Such a standard representation is useful to provide application developers with a high-quality resource and to promote interoperability. Important requirements in this conversion process are that it should be complete and should stay close to WordNet's conceptual model. The paper explains the steps taken to produce the conversion and details design decisions such as the composition of the class hierarchy and properties, the addition of suitable OWL semantics, and the chosen format of the URIs. Additional topics include a strategy to incorporate OWL and RDFS semantics in one schema such that both RDF(S) infrastructure and OWL infrastructure can interpret the information correctly, problems encountered in understanding the Prolog source files, and the description of the two versions that are provided (Basic and Full) to accommodate different usages of WordNet.
