Aldo Gangemi


2023

Towards Distribution-shift Robust Text Classification of Emotional Content
Luana Bulla | Aldo Gangemi | Misael Mongiovì
Findings of the Association for Computational Linguistics: ACL 2023

Supervised models based on Transformers have been shown to achieve impressive performance in many natural language processing tasks. However, besides requiring a large amount of costly manually annotated data, supervised models tend to adapt to the characteristics of the training dataset, which is usually created ad hoc and whose data distribution often differs from that of real applications, so they degrade significantly in real-world scenarios. We perform an extensive assessment of the out-of-distribution performance of supervised classification models on the emotion and hate-speech detection tasks and show that NLI-based zero-shot models often outperform them, making task-specific annotation useless when the characteristics of final-user data are not known in advance. To benefit from both supervised and zero-shot approaches, we propose to fine-tune an NLI-based model on the task-specific dataset. The resulting model often outperforms all available supervised models both in distribution and out of distribution, with only a few thousand training samples.

2022

Uncovering Values: Detecting Latent Moral Content from Natural Language with Explainable and Non-Trained Methods
Luigi Asprino | Luana Bulla | Stefano De Giorgis | Aldo Gangemi | Ludovica Marinucci | Misael Mongiovi
Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures

Moral values as commonsense norms shape our everyday individual and community behavior. The possibility of rapidly extracting moral attitudes from natural language is an appealing prospect that would enable a deeper understanding of social interaction dynamics and of the individual cognitive and behavioral dimension. In this work we focus on detecting moral content from natural language, and we test our methods on a corpus of tweets previously labeled as containing moral values or violations, according to Moral Foundation Theory. We develop and compare two different approaches: (i) a frame-based symbolic value detector based on knowledge graphs and (ii) a zero-shot machine learning model fine-tuned on a Natural Language Inference (NLI) task and an emotion detection task. The final outcome of our work consists of two approaches meant to work without any prior training on a moral value detection task.

2016

Proceedings of the 2nd International Workshop on Natural Language Generation and the Semantic Web (WebNLG 2016)
Claire Gardent | Aldo Gangemi
Proceedings of the 2nd International Workshop on Natural Language Generation and the Semantic Web (WebNLG 2016)

2010

Senso Comune
Alessandro Oltramari | Guido Vetere | Maurizio Lenzerini | Aldo Gangemi | Nicola Guarino
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper introduces the general features of Senso Comune, an open knowledge base for the Italian language, focusing on the interplay of lexical and ontological knowledge, and outlining our approach to conceptual knowledge elicitation. Senso Comune consists of a machine-readable lexicon constrained by an ontological infrastructure. The idea at the basis of Senso Comune is that natural languages exist in use, and they belong to their users. In line with Saussure's linguistics, natural languages are seen as a social product, and their main strength lies in the users' consensus. At the same time, language has specific goals, i.e. referring to entities that belong to the users' world (be it physical or not) and that are made up in social environments where expressions are produced and understood. This usage leverages the creativity of those who produce words and try to understand them. This is the reason why ontology, i.e. a shared conceptualization of the world, can be regarded as the soil in which the speakers' consensus may be rooted. Some final remarks concerning future work and applications are also given.

2008

LMM: an OWL-DL MetaModel to Represent Heterogeneous Lexical Knowledge
Davide Picca | Alfio Massimiliano Gliozzo | Aldo Gangemi
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In this paper we present a Linguistic Meta-Model (LMM) allowing a semiotic-cognitive representation of knowledge. LMM is freely available and integrates the schemata of linguistic knowledge resources, such as WordNet and FrameNet, as well as foundational ontologies, such as DOLCE and its extensions. In addition, LMM is able to deal with multilinguality and to represent individuals and facts from an open-domain perspective.

2006

Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Nicoletta Calzolari | Khalid Choukri | Aldo Gangemi | Bente Maegaard | Joseph Mariani | Jan Odijk | Daniel Tapias
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

Conversion of WordNet to a standard RDF/OWL representation
Mark van Assem | Aldo Gangemi | Guus Schreiber
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper presents an overview of the work in progress at the W3C to produce a conversion of WordNet to the RDF/OWL representation language in use in the Semantic Web community. Such a standard representation is useful to provide application developers with a high-quality resource and to promote interoperability. Important requirements in this conversion process are that it should be complete and should stay close to WordNet's conceptual model. The paper explains the steps taken to produce the conversion and details design decisions such as the composition of the class hierarchy and properties, the addition of suitable OWL semantics, and the chosen format of the URIs. Additional topics include a strategy to incorporate OWL and RDFS semantics in one schema such that both RDF(S) infrastructure and OWL infrastructure can interpret the information correctly; problems encountered in understanding the Prolog source files; and the description of the two versions that are provided (Basic and Full) to accommodate different usages of WordNet.