Linguistic Issues in Language Technology (2017)
In this paper, we examine the correlation between lexical semantics and the syntactic realization of the different components of a word’s meaning in natural language. More specifically, we will explore the effect that lexical factorization in verb semantics has on the suppression or expression of semantic features within the sentence. Factorization was a common analytic tool employed in early generative linguistic approaches to lexical decomposition, and continues to play a role in contemporary semantics, in various guises and modified forms. Building on the unpublished analysis of verbs of seeing in Joshi (1972), we argue here that the significance of lexical factorization is twofold: first, current models of verb meaning owe much of their insight to factor-based theories of meaning; secondly, the factorization properties of a lexical item appear to influence, both directly and indirectly, the possible syntactic expressibility of arguments and adjuncts in sentence composition. We argue that this information can be used to compute what we call the factor expression likelihood (FEL) associated with a verb in a sentence. This is the likelihood that the overt syntactic expression of a factor will cooccur with the verb. This has consequences for the compositional mechanisms responsible for computing the meaning of the sentence, as well as significance in the creation of computational models attempting to capture linguistic behavior over large corpora.
We consider the extent to which different deep neural network (DNN) configurations can learn syntactic relations, by taking up Linzen et al.’s (2016) work on subject-verb agreement with LSTM RNNs. We test their methods on a much larger corpus than they used (a ⇠24 million example part of the WaCky corpus, instead of their ⇠1.35 million example corpus, both drawn from Wikipedia). We experiment with several different DNN architectures (LSTM RNNs, GRUs, and CNNs), and alternative parameter settings for these systems (vocabulary size, training to test ratio, number of layers, memory size, drop out rate, and lexical embedding dimension size). We also try out our own unsupervised DNN language model. Our results are broadly compatible with those that Linzen et al. report. However, we discovered some interesting, and in some cases, surprising features of DNNs and language models in their performance of the agreement learning task. In particular, we found that DNNs require large vocabularies to form substantive lexical embeddings in order to learn structural patterns. This finding has interesting consequences for our understanding of the way in which DNNs represent syntactic information. It suggests that DNNs learn syntactic patterns more efficiently through rich lexical embeddings, with semantic as well as syntactic cues, than from training on lexically impoverished strings that highlight structural patterns.
This paper presents a new classification of verbs of change and modification, proposing a dynamic interpretation of the lexical semantics of the predicate and its arguments. Adopting the model of dynamic event structure proposed in Pustejovsky (2013), and extending the model of dynamic selection outlined in Pustejovsky and Jezek (2011), we define a verb class in terms of its Dynamic Argument Structure (DAS), a representation which encodes how the participants involved in the change behave as the event unfolds. We address how the logical resources and results of change predicates are realized syntactically, if at all, as well as how the exploitation of the resource results in the initiation or termination of a new object, i.e. the result. We show how DAS can be associated with a dynamically encoded event structure representation, which measures the change making reference to a scalar component, modelled in terms of assignment and/or testing of values of attributes of participants.