Roberto Carlini

2021

On the evolution of syntactic information encoded by BERT’s contextualized representations
Laura Pérez-Mayos | Roberto Carlini | Miguel Ballesteros | Leo Wanner
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

The adaptation of pretrained language models to solve supervised tasks has become a baseline in NLP, and many recent works have focused on studying how linguistic information is encoded in the pretrained sentence representations. Among other information, it has been shown that entire syntax trees are implicitly embedded in the geometry of such models. As these models are often fine-tuned, it becomes increasingly important to understand how the encoded knowledge evolves along the fine-tuning. In this paper, we analyze the evolution of the embedded syntax trees along the fine-tuning process of BERT for six different tasks, covering all levels of the linguistic structure. Experimental results show that the encoded syntactic information is forgotten (PoS tagging), reinforced (dependency and constituency parsing) or preserved (semantics-related tasks) in different ways along the fine-tuning process depending on the task.

2018

pdf bib

Generation of a Spanish Artificial Collocation Error Corpus
Sara Rodríguez-Fernández | Roberto Carlini | Leo Wanner
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

pdf bib abs

FORGe at SemEval-2017 Task 9: Deep sentence generation based on a sequence of graph transducers
Simon Mille | Roberto Carlini | Alicia Burga | Leo Wanner
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

We present the contribution of Universitat Pompeu Fabra’s NLP group to the SemEval Task 9.2 (AMR-to-English Generation). The proposed generation pipeline comprises: (i) a series of rule-based graph-transducers for the syntacticization of the input graphs and the resolution of morphological agreements, and (ii) an off-the-shelf statistical linearization component.

2016

pdf bib abs

Example-based Acquisition of Fine-grained Collocation Resources
Sara Rodríguez-Fernández | Roberto Carlini | Luis Espinosa Anke | Leo Wanner
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Collocations such as “heavy rain” or “make [a] decision”, are combinations of two elements where one (the base) is freely chosen, while the choice of the other (collocate) is restricted, depending on the base. Collocations present difficulties even to advanced language learners, who usually struggle to find the right collocate to express a particular meaning, e.g., both “heavy” and “strong” express the meaning ‘intense’, but while “rain” selects “heavy”, “wind” selects “strong”. Lexical Functions (LFs) describe the meanings that hold between the elements of collocations, such as ‘intense’, ‘perform’, ‘create’, ‘increase’, etc. Language resources with semantically classified collocations would be of great help for students, however they are expensive to build, since they are manually constructed, and scarce. We present an unsupervised approach to the acquisition and semantic classification of collocations according to LFs, based on word embeddings in which, given an example of a collocation for each of the target LFs and a set of bases, the system retrieves a list of collocates for each base and LF.

pdf bib

Semantics-Driven Recognition of Collocations Using Word Embeddings
Sara Rodríguez-Fernández | Luis Espinosa-Anke | Roberto Carlini | Leo Wanner
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Roberto Carlini

2021

2018

2017

2016

2015

2014

2013

Co-authors

Venues