Jeremy Reffin


2017

When a Red Herring is Not a Red Herring: Using Compositional Methods to Detect Non-Compositional Phrases
Julie Weeds | Thomas Kober | Jeremy Reffin | David Weir
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

Non-compositional phrases such as red herring and weakly compositional phrases such as spelling bee are an integral part of natural language (Sag, 2002). They are also the phrases that are difficult, or even impossible, for good compositional distributional models of semantics to handle. Compositionality detection therefore provides a good testbed for compositional methods. We compare an integrated compositional distributional approach, using sparse high-dimensional representations, with the ad-hoc compositional approach of applying simple composition operations to state-of-the-art neural embeddings.

One Representation per Word - Does it make Sense for Composition?
Thomas Kober | Julie Weeds | John Wilkie | Jeremy Reffin | David Weir
Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications

In this paper, we investigate whether an a priori disambiguation of word senses is strictly necessary or whether the meaning of a word in context can be disambiguated through composition alone. We evaluate the performance of off-the-shelf single-vector and multi-sense vector models on a benchmark phrase similarity task and a novel task for word-sense discrimination. We find that single-sense vector models perform as well as or better than multi-sense vector models despite arguably less clean elementary representations. Our findings furthermore show that simple composition functions such as pointwise addition are able to recover sense-specific information from a single-sense vector model remarkably well.

Improving Semantic Composition with Offset Inference
Thomas Kober | Julie Weeds | Jeremy Reffin | David Weir
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Count-based distributional semantic models suffer from sparsity due to unobserved but plausible co-occurrences in any text collection. This problem is amplified for models such as Anchored Packed Trees (APTs) that take the grammatical type of a co-occurrence into account. We therefore introduce a novel form of distributional inference that exploits the rich type structure in APTs and infers missing data by the same mechanism that is used for semantic composition.

2016

Improving Sparse Word Representations with Distributional Inference for Semantic Composition
Thomas Kober | Julie Weeds | Jeremy Reffin | David Weir
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

A critique of word similarity as a method for evaluating distributional semantic models
Miroslav Batchkarov | Thomas Kober | Jeremy Reffin | Julie Weeds | David Weir
Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP

Aligning Packed Dependency Trees: A Theory of Composition for Distributional Semantics
David Weir | Julie Weeds | Jeremy Reffin | Thomas Kober
Computational Linguistics, Volume 42, Issue 4 - December 2016

2014

Distributional Composition using Higher-Order Dependency Vectors
Julie Weeds | David Weir | Jeremy Reffin
Proceedings of the 2nd Workshop on Continuous Vector Space Models and their Compositionality (CVSC)

Learning to Distinguish Hypernyms and Co-Hyponyms
Julie Weeds | Daoud Clarke | Jeremy Reffin | David Weir | Bill Keller
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Method51 for Mining Insight from Social Media Datasets
Simon Wibberley | David Weir | Jeremy Reffin
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: System Demonstrations

2013

Language Technology for Agile Social Media Science
Simon Wibberley | David Weir | Jeremy Reffin
Proceedings of the 7th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities