Pegah Alipoor


2020

pdf bib
Variants of Vector Space Reductions for Predicting the Compositionality of English Noun Compounds
Pegah Alipoor | Sabine Schulte im Walde
Proceedings of the Twelfth Language Resources and Evaluation Conference

Predicting the degree of compositionality of noun compounds such as “snowball” and “butterfly” is a crucial ingredient for lexicography and Natural Language Processing applications, to know whether the compound should be treated as a whole, or through its constituents, and what it means. Computational approaches for an automatic prediction typically represent and compare compounds and their constituents within a vector space and use distributional similarity as a proxy to predict the semantic relatedness between the compounds and their constituents as the compound’s degree of compositionality. This paper provides a systematic evaluation of vector-space reduction variants across kinds, exploring reductions based on part-of-speech next to and also in combination with Principal Components Analysis using Singular Value and word2vec embeddings. We show that word2vec and nouns only dimensionality reductions are the most successful and stable vector space variants for our task.