Mario Giulianelli


2021

pdf bib
Is Information Density Uniform in Task-Oriented Dialogues?
Mario Giulianelli | Arabella Sinclair | Raquel Fernández
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

The Uniform Information Density principle states that speakers plan their utterances to reduce fluctuations in the density of the information transmitted. In this paper, we test whether, and within which contextual units this principle holds in task-oriented dialogues. We show that there is evidence supporting the principle in written dialogues where participants play a cooperative reference game as well as in spoken dialogues involving instruction giving and following. Our study underlines the importance of identifying the relevant contextual components, showing that information content increases particularly within topically and referentially related contextual units.

pdf bib
Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP
Jasmijn Bastings | Yonatan Belinkov | Emmanuel Dupoux | Mario Giulianelli | Dieuwke Hupkes | Yuval Pinter | Hassan Sajjad
Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP

pdf bib
Grammatical Profiling for Semantic Change Detection
Andrey Kutuzov | Lidia Pivovarova | Mario Giulianelli
Proceedings of the 25th Conference on Computational Natural Language Learning

Semantics, morphology and syntax are strongly interdependent. However, the majority of computational methods for semantic change detection use distributional word representations which encode mostly semantics. We investigate an alternative method, grammatical profiling, based entirely on changes in the morphosyntactic behaviour of words. We demonstrate that it can be used for semantic change detection and even outperforms some distributional semantic methods. We present an in-depth qualitative and quantitative analysis of the predictions made by our grammatical profiling system, showing that they are plausible and interpretable.

pdf bib
Analysing Human Strategies of Information Transmission as a Function of Discourse Context
Mario Giulianelli | Raquel Fernández
Proceedings of the 25th Conference on Computational Natural Language Learning

Speakers are thought to use rational information transmission strategies for efficient communication (Genzel and Charniak, 2002; Aylett and Turk, 2004; Jaeger and Levy, 2007). Previous work analysing these strategies in sentence production has failed to take into account how the information content of sentences varies as a function of the available discourse context. In this study, we estimate sentence information content within discourse context. We find that speakers transmit information at a stable rate—i.e., rationally—in English newspaper articles but that this rate decreases in spoken open domain and written task-oriented dialogues. We also observe that speakers’ choices are not oriented towards local uniformity of information, which is another hypothesised rational strategy. We suggest that a more faithful model of communication should explicitly include production costs and goal-oriented rewards.

2020

pdf bib
Refer, Reuse, Reduce: Generating Subsequent References in Visual and Conversational Contexts
Ece Takmaz | Mario Giulianelli | Sandro Pezzelle | Arabella Sinclair | Raquel Fernández
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Dialogue participants often refer to entities or situations repeatedly within a conversation, which contributes to its cohesiveness. Subsequent references exploit the common ground accumulated by the interlocutors and hence have several interesting properties, namely, they tend to be shorter and reuse expressions that were effective in previous mentions. In this paper, we tackle the generation of first and subsequent references in visually grounded dialogue. We propose a generation model that produces referring utterances grounded in both the visual and the conversational context. To assess the referring effectiveness of its output, we also implement a reference resolution system. Our experiments and analyses show that the model produces better, more effective referring utterances than a model not grounded in the dialogue context, and generates subsequent references that exhibit linguistic patterns akin to humans.

pdf bib
Analysing Lexical Semantic Change with Contextualised Word Representations
Mario Giulianelli | Marco Del Tredici | Raquel Fernández
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

This paper presents the first unsupervised approach to lexical semantic change that makes use of contextualised word representations. We propose a novel method that exploits the BERT neural language model to obtain representations of word usages, clusters these representations into usage types, and measures change along time with three proposed metrics. We create a new evaluation dataset and show that the model representations and the detected semantic shifts are positively correlated with human judgements. Our extensive qualitative analysis demonstrates that our method captures a variety of synchronic and diachronic linguistic phenomena. We expect our work to inspire further research in this direction.

pdf bib
UiO-UvA at SemEval-2020 Task 1: Contextualised Embeddings for Lexical Semantic Change Detection
Andrey Kutuzov | Mario Giulianelli
Proceedings of the Fourteenth Workshop on Semantic Evaluation

We apply contextualised word embeddings to lexical semantic change detection in the SemEval-2020 Shared Task 1. This paper focuses on Subtask 2, ranking words by the degree of their semantic drift over time. We analyse the performance of two contextualising architectures (BERT and ELMo) and three change detection algorithms. We find that the most effective algorithms rely on the cosine similarity between averaged token embeddings and the pairwise distances between token embeddings. They outperform strong baselines by a large margin (in the post-evaluation phase, we have the best Subtask 2 submission for SemEval-2020 Task 1), but interestingly, the choice of a particular algorithm depends on the distribution of gold scores in the test set.

2018

pdf bib
Under the Hood: Using Diagnostic Classifiers to Investigate and Improve how Language Models Track Agreement Information
Mario Giulianelli | Jack Harding | Florian Mohnert | Dieuwke Hupkes | Willem Zuidema
Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

How do neural language models keep track of number agreement between subject and verb? We show that ‘diagnostic classifiers’, trained to predict number from the internal states of a language model, provide a detailed understanding of how, when, and where this information is represented. Moreover, they give us insight into when and where number information is corrupted in cases where the language model ends up making agreement errors. To demonstrate the causal role played by the representations we find, we then use agreement information to influence the course of the LSTM during the processing of difficult sentences. Results from such an intervention reveal a large increase in the language model’s accuracy. Together, these results show that diagnostic classifiers give us an unrivalled detailed look into the representation of linguistic information in neural models, and demonstrate that this knowledge can be used to improve their performance.