Roberto Zamparelli


An LSTM Adaptation Study of (Un)grammaticality
Shammur Absar Chowdhury | Roberto Zamparelli
Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

We propose a novel approach to the study of how artificial neural networks perceive the distinction between grammatical and ungrammatical sentences, a crucial task in the growing field of synthetic linguistics. The method is based on performance measures of language models trained on corpora and fine-tuned with either grammatical or ungrammatical sentences, then applied to (different types of) grammatical or ungrammatical sentences. The results show that both in the difficult and highly symmetrical task of detecting subject islands and in the more open CoLA dataset, grammatical sentences give rise to better scores than ungrammatical ones, possibly because they can be better integrated within the body of linguistic structural knowledge that the language model has accumulated.


RNN Simulations of Grammaticality Judgments on Long-distance Dependencies
Shammur Absar Chowdhury | Roberto Zamparelli
Proceedings of the 27th International Conference on Computational Linguistics

The paper explores the ability of LSTM networks trained on a language modeling task to detect linguistic structures which are ungrammatical due to extraction violations (extra arguments and subject-relative clause island violations), and considers its implications for the debate on language innatism. The results show that the current RNN model can correctly classify (un)grammatical sentences in certain conditions, but it is sensitive to linguistic processing factors and probably ultimately unable to induce a more abstract notion of grammaticality, at least in the domain we tested.


SemEval-2014 Task 1: Evaluation of Compositional Distributional Semantic Models on Full Sentences through Semantic Relatedness and Textual Entailment
Marco Marelli | Luisa Bentivogli | Marco Baroni | Raffaella Bernardi | Stefano Menini | Roberto Zamparelli
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

Frege in Space: A Program for Composition Distributional Semantics
Marco Baroni | Raffaella Bernardi | Roberto Zamparelli
Linguistic Issues in Language Technology, Volume 9, 2014 - Perspectives on Semantic Representations for Textual Inference

The lexicon of any natural language encodes a huge number of distinct word meanings. Just to understand this article, you will need to know what thousands of words mean. The space of possible sentential meanings is infinite: In this article alone, you will encounter many sentences that express ideas you have never heard before, we hope. Statistical semantics has addressed the issue of the vastness of word meaning by proposing methods to harvest meaning automatically from large collections of text (corpora). Formal semantics in the Fregean tradition has developed methods to account for the infinity of sentential meaning based on the crucial insight of compositionality, the idea that the meaning of sentences is built incrementally by combining the meanings of their constituents. This article sketches a new approach to semantics that brings together ideas from statistical and formal semantics to account, in parallel, for the richness of lexical meaning and the combinatorial power of sentential semantics. We adopt, in particular, the idea that word meaning can be approximated by the patterns of co-occurrence of words in corpora from statistical semantics, and the idea that compositionality can be captured in terms of a syntax-driven calculus of function application from formal semantics.

A SICK cure for the evaluation of compositional distributional semantic models
Marco Marelli | Stefano Menini | Marco Baroni | Luisa Bentivogli | Raffaella Bernardi | Roberto Zamparelli
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Shared and internationally recognized benchmarks are fundamental for the development of any computational system. We aim to help the research community working on compositional distributional semantic models (CDSMs) by providing SICK (Sentences Involving Compositional Knowledge), a large size English benchmark tailored for them. SICK consists of about 10,000 English sentence pairs that include many examples of the lexical, syntactic and semantic phenomena that CDSMs are expected to account for, but do not require dealing with other aspects of existing sentential data sets (idiomatic multiword expressions, named entities, telegraphic language) that are not within the scope of CDSMs. By means of crowdsourcing techniques, each pair was annotated for two crucial semantic tasks: relatedness in meaning (with a 5-point rating scale as gold score) and entailment relation between the two elements (with three possible gold labels: entailment, contradiction, and neutral). The SICK data set was used in SemEval-2014 Task 1, and it is freely available for research purposes.


Studying the Recursive Behaviour of Adjectival Modification with Compositional Distributional Semantics
Eva Maria Vecchi | Roberto Zamparelli | Marco Baroni
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Compositional-ly Derived Representations of Morphologically Complex Words in Distributional Semantics
Angeliki Lazaridou | Marco Marelli | Roberto Zamparelli | Marco Baroni
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Proceedings of the IWCS 2013 Workshop Towards a Formal Distributional Semantics
Aurelie Herbelot | Roberto Zamparelli | Gemma Boleda
Proceedings of the IWCS 2013 Workshop Towards a Formal Distributional Semantics


(Linear) Maps of the Impossible: Capturing Semantic Anomalies in Distributional Space
Eva Maria Vecchi | Marco Baroni | Roberto Zamparelli
Proceedings of the Workshop on Distributional Semantics and Compositionality


Nouns are Vectors, Adjectives are Matrices: Representing Adjective-Noun Constructions in Semantic Space
Marco Baroni | Roberto Zamparelli
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing