2022
Does BERT Recognize an Agent? Modeling Dowty’s Proto-Roles with Contextual Embeddings
Mattia Proietti | Gianluca Lebani | Alessandro Lenci
Proceedings of the 29th International Conference on Computational Linguistics
Contextual embeddings build multidimensional representations of word tokens based on their context of occurrence. Such models have been shown to achieve state-of-the-art performance on a wide variety of tasks. Yet the community still struggles to understand what kind of semantic knowledge these representations encode. We report a series of experiments investigating to what extent one such model, BERT, is able to infer the semantic relations that, according to Dowty’s Proto-Roles theory, a verbal argument receives by virtue of its role in the event described by the verb. This hypothesis was put to the test by learning a linear mapping from BERT’s verb embeddings to an interpretable space of semantic properties built from the linguistic dataset by White et al. (2016). In a first experiment, we tested whether the semantic properties inferred from a typed version of the BERT embeddings would be more linguistically plausible than those produced by relying on static embeddings. We then evaluated the semantic properties inferred from the contextual embeddings both against those available in the original dataset and by assessing their ability to model the semantic properties possessed by the agent of the verbs participating in the so-called causative alternation.
2016
Lexical Variability and Compositionality: Investigating Idiomaticity with Distributional Semantic Models
Marco Silvio Giuseppe Senaldi | Gianluca E. Lebani | Alessandro Lenci
Proceedings of the 12th Workshop on Multiword Expressions
“Beware the Jabberwock, dear reader!” Testing the distributional reality of construction semantics
Gianluca Lebani | Alessandro Lenci
Proceedings of the 5th Workshop on Cognitive Aspects of the Lexicon (CogALex - V)
Notwithstanding the success of the notion of construction, the computational tradition still lacks a way to represent the semantic content of these linguistic entities. Here we present a simple corpus-based model implementing the idea that the meaning of a syntactic construction is intimately related to the semantics of its typical verbs. The model is a two-step process that starts by identifying the typical verbs occurring with a given syntactic construction and building their distributional vectors. We then calculate the weighted centroid of these vectors to derive the distributional signature of a construction. To assess our approach, we replicated the priming effect described by Johnson and Goldberg (2013) as a function of the semantic distance between a construction and its prototypical verbs. Additional support for our view comes from a regression analysis showing that our distributional information can be used to model behavioral data collected with a crowdsourced elicitation experiment.
LexFr: Adapting the LexIt Framework to Build a Corpus-based French Subcategorization Lexicon
Giulia Rambelli | Gianluca Lebani | Laurent Prévot | Alessandro Lenci
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
This paper introduces LexFr, a corpus-based French lexical resource built by adapting the LexIt framework, originally developed to describe the combinatorial potential of Italian predicates. As in the original framework, the behavior of a group of target predicates is characterized by syntactic (i.e., subcategorization frames) and semantic (i.e., selectional preferences) statistical information (a.k.a. distributional profiles), whose extraction process is mostly unsupervised. The first release of LexFr includes information for 2,493 verbs, 7,939 nouns and 2,628 adjectives. In these pages we describe the adaptation process and evaluate the final resource by comparing the information collected for 20 test verbs against the information available in a gold standard dictionary. In the best performing setting, we obtained 0.74 precision, 0.66 recall and 0.70 F-measure.
2014
Bootstrapping an Italian VerbNet: data-driven analysis of verb alternations
Gianluca Lebani | Veronica Viola | Alessandro Lenci
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
The goal of this paper is to propose a classification of the syntactic alternations admitted by the most frequent Italian verbs. We present and discuss in depth the data-driven two-step procedure we adopted and the structure of the identified classes of alternations. This classification has been developed with a practical application in mind, namely the semi-automatic building of a VerbNet-like lexicon for Italian verbs, partly following the methodology proposed in the context of the VerbNet project. Even so, its availability may have a positive impact on several related research topics and Natural Language Processing tasks.
Choosing which to use? A study of distributional models for nominal lexical semantic classification
Lauren Romeo | Gianluca Lebani | Núria Bel | Alessandro Lenci
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
This paper empirically evaluates the performance of different state-of-the-art distributional models in a nominal lexical semantic classification task. We consider models that exploit various types of distributional features and thereby provide different representations of nominal behavior in context. The experiments presented in this work demonstrate the advantages and disadvantages of each model considered. The analysis also considers a combined strategy that we found capable of offsetting the bottlenecks of each model, especially when large, robust data is not available.
2010
A Feature Type Classification for Therapeutic Purposes: A Preliminary Evaluation with Non-Expert Speakers
Gianluca E. Lebani | Emanuele Pianta
Proceedings of the Fourth Linguistic Annotation Workshop
Exploiting Lexical Resources for Therapeutic Purposes: the Case of WordNet and STaRS.sys
Gianluca E. Lebani | Emanuele Pianta
Proceedings of the 2nd Workshop on Cognitive Aspects of the Lexicon