Mattia Proietti

2025

Leveraging LLMs to Build a Semi-synthetic Dataset for Legal Information Retrieval: A Case Study on the Italian Civil Code and GPT4-O
Mattia Proietti | Lucia C. Passaro | Alessandro Lenci
Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)

2022

Does BERT Recognize an Agent? Modeling Dowty’s Proto-Roles with Contextual Embeddings
Mattia Proietti | Gianluca Lebani | Alessandro Lenci
Proceedings of the 29th International Conference on Computational Linguistics

Contextual embeddings build multidimensional representations of word tokens based on their context of occurrence. Such models have been shown to achieve state-of-the-art performance on a wide variety of tasks. Yet the community struggles to understand what kind of semantic knowledge these representations encode. We report a series of experiments investigating to what extent one such model, BERT, is able to infer the semantic relations that, according to Dowty’s Proto-Roles theory, a verbal argument receives by virtue of its role in the event described by the verb. This hypothesis was put to the test by learning a linear mapping from BERT’s verb embeddings to an interpretable space of semantic properties built from the linguistic dataset by White et al. (2016). In a first experiment, we tested whether the semantic properties inferred from a typed version of the BERT embeddings would be more linguistically plausible than those produced by relying on static embeddings. We then evaluated the semantic properties inferred from the contextual embeddings both against those available in the original dataset and by assessing their ability to model the semantic properties possessed by the agent of the verbs participating in the so-called causative alternation.

Venues

CLiC-it (1)
COLING (1)
