Paul Miller


2021

Representation and Pre-Activation of Lexical-Semantic Knowledge in Neural Language Models
Steven Derby | Paul Miller | Barry Devereux
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

In this paper, we perform a systematic analysis of how closely the intermediate layers from LSTM and transformer language models correspond to human semantic knowledge. Furthermore, in order to make more meaningful comparisons with theories of human language comprehension in psycholinguistics, we focus on two key stages where the meaning of a particular target word may arise: immediately before the word’s presentation to the model (comparable to forward inferencing), and immediately after the word token has been input into the network. Our results indicate that the transformer models are better at capturing semantic knowledge relating to lexical concepts, both during word prediction and when retention is required.

2020

Analysing Word Representation from the Input and Output Embeddings in Neural Network Language Models
Steven Derby | Paul Miller | Barry Devereux
Proceedings of the 24th Conference on Computational Natural Language Learning

Researchers have recently demonstrated that tying the neural weights between the input look-up table and the output classification layer can improve training and lower perplexity on sequence learning tasks such as language modelling. Such a procedure is possible due to the design of the softmax classification layer, which previous work has shown to comprise a viable set of semantic representations for the model vocabulary, and these output embeddings are known to perform well on word similarity benchmarks. In this paper, we make meaningful comparisons between the input and output embeddings and other SOTA distributional models to gain a better understanding of the types of information they represent. We also construct a new set of word embeddings using the output embeddings to create locally-optimal approximations for the intermediate representations from the language model. These locally-optimal embeddings demonstrate excellent performance across all our evaluations.

Encoding Lexico-Semantic Knowledge using Ensembles of Feature Maps from Deep Convolutional Neural Networks
Steven Derby | Paul Miller | Barry Devereux
Proceedings of the 28th International Conference on Computational Linguistics

Semantic models derived from visual information have helped to overcome some of the limitations of solely text-based distributional semantic models. Researchers have demonstrated that text and image-based representations encode complementary semantic information, which when combined provide a more complete representation of word meaning, in particular when compared with data on human conceptual knowledge. In this work, we reveal that these vision-based representations, whilst quite effective, do not make use of all the semantic information available in the neural network that could be used to inform vector-based models of semantic representation. Instead, we build image-based meta-embeddings from computer vision models, which can incorporate information from all layers of the network, and show that they encode a richer set of semantic attributes and yield a more complete representation of human conceptual knowledge.

2019

Feature2Vec: Distributional semantic modelling of human property knowledge
Steven Derby | Paul Miller | Barry Devereux
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Feature norm datasets of human conceptual knowledge, collected in surveys of human volunteers, yield highly interpretable models of word meaning and play an important role in neurolinguistic research on semantic cognition. However, these datasets are limited in size due to practical obstacles associated with exhaustively listing properties for a large number of words. In contrast, the development of distributional modelling techniques and the availability of vast text corpora have allowed researchers to construct effective vector space models of word meaning over large lexicons. However, this comes at the cost of interpretable, human-like information about word meaning. We propose a method for mapping human property knowledge onto a distributional semantic space, which adapts the word2vec architecture to the task of modelling concept features. Our approach gives a measure of concept and feature affinity in a single semantic space, which makes for easy and efficient ranking of candidate human-derived semantic properties for arbitrary words. We compare our model with a previous approach, and show that it performs better on several evaluation tasks. Finally, we discuss how our method could be used to develop efficient sampling techniques to extend existing feature norm datasets in a reliable way.

2018

Representation of Word Meaning in the Intermediate Projection Layer of a Neural Language Model
Steven Derby | Paul Miller | Brian Murphy | Barry Devereux
Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

Performance in language modelling has been significantly improved by training recurrent neural networks on large corpora. This progress has come at the cost of interpretability and an understanding of how these architectures function, making principled development of better language models more difficult. We look inside a state-of-the-art neural language model to analyse how this model represents high-level lexico-semantic information. In particular, we investigate how the model represents words by extracting activation patterns where they occur in the text, and compare these representations directly to human semantic knowledge.

Using Sparse Semantic Embeddings Learned from Multimodal Text and Image Data to Model Human Conceptual Knowledge
Steven Derby | Paul Miller | Brian Murphy | Barry Devereux
Proceedings of the 22nd Conference on Computational Natural Language Learning

Distributional models provide a convenient way to model semantics using dense embedding spaces derived from unsupervised learning algorithms. However, the dimensions of dense embedding spaces are not designed to resemble human semantic knowledge. Moreover, embeddings are often built from a single source of information (typically text data), even though neurocognitive research suggests that semantics is deeply linked to both language and perception. In this paper, we combine multimodal information from both text and image-based representations derived from state-of-the-art distributional models to produce sparse, interpretable vectors using Joint Non-Negative Sparse Embedding. Through in-depth analyses comparing these sparse models to human-derived behavioural and neuroimaging data, we demonstrate their ability to predict interpretable linguistic descriptions of human ground-truth semantic knowledge.