Proceedings of the Probability and Meaning Conference (PaM 2020)
Certain conditionals have something other than a clause as their consequent: their antecedent if-clauses are ‘adverbial clauses’ without a verb. We argue that they function in a way already seen for those with clausal consequents, despite lacking the content we might expect for the formation of a conditional. The use of the if-clause with sub-clausal consequents is feasible thanks to the fact that this function does not depend on the consequent content, and so is not impeded when the consequent does not provide a proposition, question or imperative. To support this we provide meaning rules for conditionals in terms of information state updates, letting the same construction play out in different ways depending on context and content.
In this paper, we propose a probabilistic model of social signalling which adopts a persona-based account of social meaning. We use this model to develop a socio-semantic theory of conventionalised reasoning patterns, known as topoi. On this account the social meaning of a topos, as conveyed in an argument, is based on the set of ideologically related topoi it indicates in context. We draw a connection between the role of personae in social meaning and the category adjustment effect, a well-known psychological phenomenon in which the representation of a stimulus is biased in the direction of the category in which it falls. Finally, we situate the interpretation of social signals as an update to the information state of an agent in a formal TTR model of dialogue.
This paper presents a formal model for the description of dogwhistles. Dogwhistles are a class of terms or expressions, common in political discourse, that are intended to be interpreted in different ways by different communities. The model presented here describes this phenomenon using a variation on the Social Meaning Games framework that uses probability distributions over possible interpretation functions as well as RSA/IBR reasoning.
Conditional utterances can be used in discourse as answers to regular, non-conditional questions when the answerer has only partial knowledge. We claim that the probabilities assigned to possible epistemic states of A are a measure of the utility of conditional answers. A second criterion that makes a conditional answer ‘if p, then q’ relevant has to do with the dependency between p and q that is conveyed in the statement. A conditional answer counts as relevant when this dependency leads the question asker to shift from a decision problem about q to an alternative, easier decision problem about p.
Modern semantic analyses of epistemic language (incl. the modals must and might) can be characterized by the following ‘credence assumption’: speakers have full certainty regarding the propositions that structure their epistemic state. Intuitively, however: a) speakers have graded, rather than categorical, commitment to these propositions, which are often never fully and explicitly articulated; b) listeners have higher-order uncertainty about this speaker uncertainty; c) must p is used to communicate speaker commitment to some conclusion p and to indicate speaker commitment to the premises that condition the conclusion. I explore the consequences of relaxing the credence assumption by extending the argument system semantic framework first proposed by Stone (1994) to a Bayesian probabilistic framework of modeling pragmatic interpretation (Goodman and Frank, 2016). The analysis makes desirable predictions regarding the behavior and interpretation of must, and it suggests a new way of considering the nature of context and communicative exchange.
Functional Distributional Semantics provides a computationally tractable framework for learning truth-conditional semantics from a corpus. Previous work in this framework has provided a probabilistic version of first-order logic, recasting quantification as Bayesian inference. In this paper, I show how the previous formulation gives trivial truth values when a precise quantifier is used with vague predicates. I propose an improved account, avoiding this problem by treating a vague predicate as a distribution over precise predicates. I connect this account to recent work in the Rational Speech Acts framework on modelling generic quantification, and I extend this to modelling donkey sentences. Finally, I explain how the generic quantifier can be both pragmatically complex and yet computationally simpler than precise quantifiers.
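The idea of treating a vague predicate as a distribution over precise predicates can be illustrated with a minimal sketch. It is not the paper's formulation: the predicate "tall", the height thresholds, their probabilities, and the function names are all hypothetical, chosen only to show how a precise quantifier can be evaluated under each precise predicate and then averaged.

```python
from math import isclose

# Hypothetical vague predicate "tall": a distribution over precise
# threshold predicates "height >= t" (thresholds in cm, illustrative only).
TALL_THRESHOLDS = {170: 0.25, 180: 0.5, 190: 0.25}


def prob_predicate(height, thresholds):
    """Probability that the vague predicate holds of one individual:
    the total weight of the precise predicates that are true of it."""
    return sum(p for t, p in thresholds.items() if height >= t)


def prob_all(heights, thresholds):
    """Probability that 'all individuals are tall': evaluate the precise
    quantifier under each precise predicate, then take the expectation."""
    return sum(
        p for t, p in thresholds.items() if all(h >= t for h in heights)
    )


heights = [175, 185, 195]
p_one = prob_predicate(185, TALL_THRESHOLDS)
p_all = prob_all(heights, TALL_THRESHOLDS)
```

Because the quantifier is evaluated inside the expectation, the result for the quantified sentence is an intermediate probability rather than being forced to 0 or 1.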
The major shortcomings of using neural networks with situated agents are that in incremental interaction very few learning examples are available and that their visual sensory representations are quite different from image caption datasets. In this work we adapt and evaluate a few-shot learning approach, Matching Networks (Vinyals et al., 2016), to conversational strategies of a robot interacting with a human tutor in order to efficiently learn to categorise objects that are presented to it and also investigate to what degree transfer learning from pre-trained models on images from different contexts can improve its performance. We discuss the implications of such learning on the nature of semantic representations the system has learned.
We present a formal semantics (a version of Type Theory with Records) which places classifiers of perceptual information at the core of semantics. Using this framework, we present an account of the interpretation and classification of utterances referring to perceptually available situations (such as visual scenes). The account improves on previous work by clarifying the role of classifiers in a hybrid semantics combining statistical/neural classifiers with logical/inferential aspects of meaning. The account covers both discrete and probabilistic classification, thereby enabling learning, vagueness and other non-discrete linguistic phenomena.
Judgements about communicative agents evolve over the course of interactions both in how individuals are judged for testimonial reliability and for (ideological) trustworthiness. This paper combines a theory of social meaning and persona with a theory of reliability within a game-theoretic view of communication, giving a formal model involving interactional histories, repeated game models and ways of evaluating social meaning and trustworthiness.
Henderson and McCready 2017, 2018, 2019 build a novel theory of so-called ‘dogwhistle’ communication by extending the social meaning games of Burnett 2017. This work reports on an ongoing project to build systems to model the evolution of dogwhistle communication in a population based on probability monads (Erwig and Kollmansberger, 2006; Kidd, 2007). The ultimate results will be useful not just for dogwhistles, but for modeling the diffusion and evolution of social meaning in populations in general. The initial result presented here is a computational implementation of Henderson and McCready 2018, which will serve as the basis for models with multiple speakers and repeated interactions.
In the frame hypothesis (CITATION), human concepts are equated with frames, which extend feature lists by a functional structure consisting of attributes and values. For example, a bachelor is represented by the attributes gender and marital status and their values ‘male’ and ‘unwed’. This paper makes the point that for many applications of concepts in cognition, including for concepts to be associated with lexemes in natural languages, the right structures to assume are not merely frames but stochastic frames in which attributes are associated with probability distributions over values. The paper introduces the idea of stochastic frames and suggests three applications: vagueness, ambiguity, and typicality.
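A stochastic frame, as described above, can be sketched as a mapping from attributes to probability distributions over values. The following is a minimal illustration under stated assumptions, not the paper's definition: the `age_group` attribute, all probabilities, and the `typicality` score (a product of per-attribute probabilities, which assumes the attributes are independent) are hypothetical.

```python
from math import isclose, prod

# Hypothetical stochastic frame for "bachelor": each attribute maps to a
# probability distribution over its values (all numbers illustrative).
bachelor = {
    "gender": {"male": 1.0},
    "marital_status": {"unwed": 1.0},
    "age_group": {"young": 0.6, "middle-aged": 0.3, "old": 0.1},
}


def is_valid(frame):
    """Check that every attribute's value distribution sums to 1."""
    return all(isclose(sum(dist.values()), 1.0) for dist in frame.values())


def typicality(frame, instance):
    """Score an instance (attribute -> value) by the product of the
    probabilities the frame assigns to its values. Assumes attribute
    independence; a value the frame does not list scores 0."""
    return prod(
        frame[attr].get(value, 0.0) for attr, value in instance.items()
    )


young_bachelor = {"gender": "male", "marital_status": "unwed", "age_group": "young"}
old_bachelor = {"gender": "male", "marital_status": "unwed", "age_group": "old"}
```

Under these assumed numbers, `typicality(bachelor, young_bachelor)` is higher than `typicality(bachelor, old_bachelor)`, which is the kind of graded judgement a plain attribute-value frame cannot express.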
Recent work in compositional distributional semantics showed how bialgebras model generalised quantifiers of natural language. That technique requires working with vector spaces over power sets of bases, and is therefore computationally costly. It is possible to overcome the computational hurdles by working with fuzzy generalised quantifiers. In this paper, we show that the compositional notion of semantics of natural language, guided by a grammar, extends from a binary to a many-valued setting, and we instantiate the fuzzy computations in it. We import vector representations of words and predicates, learnt from large-scale compositional distributional semantics, interpret them as fuzzy sets, and analyse their performance on a toy inference dataset.
Semantic frames are formal linguistic structures describing situations/actions/events, e.g. Commercial transfer of goods. Each frame provides a set of roles corresponding to the situation participants, e.g. Buyer and Goods, and lexical units (LUs) – words and phrases that can evoke this particular frame in texts, e.g. Sell. The scarcity of annotated resources hinders wider adoption of frame semantics across languages and domains. We investigate a simple yet effective method, lexical substitution with word representation models, to automatically expand a small set of frame-annotated sentences with new words for their respective roles and LUs. We evaluate the expansion quality using FrameNet. Contextualized models demonstrate overall superior performance compared to the non-contextualized ones on roles. However, the latter show comparable performance on the task of LU expansion.
At the intersection between computer vision and natural language processing, there has been recent progress on two natural language generation tasks: Dense Image Captioning and Referring Expression Generation for objects in complex scenes. The former aims to provide a caption for a specified object in a complex scene for the benefit of an interlocutor who may not be able to see it. The latter aims to produce a referring expression that will serve to identify a given object in a scene that the interlocutor can see. The two tasks are designed for different assumptions about the common ground between the interlocutors, and serve very different purposes, although they both associate a linguistic description with an object in a complex scene. Despite these fundamental differences, the distinction between these two tasks is sometimes overlooked. Here, we undertake a side-by-side comparison between image captioning and reference game human datasets and show that they differ systematically with respect to informativity. We hope that an understanding of the systematic differences among these human datasets will ultimately allow them to be leveraged more effectively in the associated engineering tasks.
Natural Language Inference models have reached almost human-level performance but their generalisation capabilities have not yet been fully characterized. In particular, sensitivity to small changes in the data is a current area of investigation. In this paper, we focus on the effect of punctuation on such models. Our findings can be broadly summarized as follows: (1) irrelevant changes in punctuation are correctly ignored by the recent transformer models (BERT), while older RNN-based models were sensitive to them; (2) all models, both transformers and RNN-based models, are incapable of taking into account small relevant changes in punctuation.
High-quality datasets for question answering exist in a few languages, but far from all. Producing such datasets for new languages requires extensive manual labour. In this work we look at different methods for using existing datasets to train question-answering models in languages lacking such datasets. We show that machine translation followed by cross-lingual projection is a viable way to create a full question-answering dataset in a new language. We introduce new methods both for bitext alignment, using optimal transport, and for direct cross-lingual projection, utilizing multilingual BERT. We show that our methods produce good Swedish question-answering models without any manual work. Finally, we apply our proposed methods to Spanish and evaluate them on the XQuAD and MLQA benchmarks, where we achieve new state-of-the-art values of 80.4 F1 and 62.9 Exact Match (EM) points on the Spanish XQuAD corpus and 70.8 F1 and 53.0 EM on the Spanish MLQA corpus, showing that the technique is readily applicable to other languages.
Homonymy is often used to showcase one of the advantages of context-sensitive word embedding techniques such as ELMo and BERT. In this paper we want to shift the focus to the related but less exhaustively explored phenomenon of polysemy, where a word expresses various distinct but related senses in different contexts. Specifically, we aim to i) investigate a recent model of polyseme sense clustering proposed by Ortega-Andres & Vicente (2019) through analysing empirical evidence of word sense grouping in human similarity judgements, ii) extend the evaluation of context-sensitive word embedding systems by examining whether they encode differences in word sense similarity and iii) compare the word sense similarities of both methods to assess their correlation and gain some intuition as to how well contextualised word embeddings could be used as surrogate word sense similarity judgements in linguistic experiments.
We present ongoing research on the relationship between short-term semantic shifts and frequency change patterns by examining the case of the refugee crisis in Austria from 2015 to 2016. Our experiments are carried out on a diachronic corpus of Austrian German, namely a corpus of newspaper articles. We trace the evolution of the usage of words that represent concepts in the context of the refugee crisis by analyzing cosine similarities of word vectors over time as well as similarities based on the words’ nearest neighbourhood sets. In order to investigate how exactly the contextual meanings have changed, we measure cosine similarity between the following pairs of words: words describing the refugee crisis, on the one hand, and words, proposed by a domain expert, that indicate the process of mediatization and politicization of the refugee crisis in Austria, on the other hand. We evaluate our approach against the expert knowledge. The paper presents the current findings and outlines directions for future work.
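The two similarity measures mentioned above can be sketched as follows. This is an illustration only: the abstract does not say how neighbourhood sets are compared, so the Jaccard overlap used here is an assumption, and the function names and toy vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for zero vectors
    return dot / (norm_u * norm_v)


def nearest_neighbours(word, vectors, k):
    """The k words whose vectors are most cosine-similar to `word`."""
    scored = sorted(
        (
            (cosine_similarity(vectors[word], vec), other)
            for other, vec in vectors.items()
            if other != word
        ),
        reverse=True,
    )
    return {other for _, other in scored[:k]}


def jaccard(a, b):
    """Overlap of two neighbour sets (assumed measure, not from the paper)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy embedding spaces for two time slices (hypothetical values).
vectors_t1 = {"x": [1.0, 0.0], "y": [0.9, 0.1], "z": [0.0, 1.0]}
vectors_t2 = {"x": [0.1, 1.0], "y": [0.9, 0.1], "z": [0.0, 1.0]}

neighbours_t1 = nearest_neighbours("x", vectors_t1, k=1)
neighbours_t2 = nearest_neighbours("x", vectors_t2, k=1)
overlap = jaccard(neighbours_t1, neighbours_t2)
```

A low overlap between the neighbour sets of the same word in two time slices is one possible signal that its usage has shifted, independently of how the two embedding spaces are aligned.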