2018
pdf
bib
abs
Analogies in Complex Verb Meaning Shifts: the Effect of Affect in Semantic Similarity Models
Maximilian Köper
|
Sabine Schulte im Walde
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)
We present a computational model to detect and distinguish analogies in meaning shifts between German base and complex verbs. In contrast to corpus-based studies, a novel dataset demonstrates that “regular” shifts represent the smallest class. Classification experiments relying on a standard similarity model successfully distinguish between four types of shifts, with verb classes boosting the performance, and affective features for abstractness, emotion and sentiment representing the most salient indicators.
pdf
bib
abs
Combining Abstractness and Language-specific Theoretical Indicators for Detecting Non-Literal Usage of Estonian Particle Verbs
Eleri Aedmaa
|
Maximilian Köper
|
Sabine Schulte im Walde
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop
This paper presents two novel datasets and a random-forest classifier to automatically predict literal vs. non-literal language usage for a highly frequent type of multi-word expression in a low-resource language, i.e., Estonian. We demonstrate the value of language-specific indicators induced from theoretical linguistic research, which outperform a high majority baseline when combined with language-independent features of non-literal language (such as abstractness).
pdf
bib
abs
Assessing Meaning Components in German Complex Verbs: A Collection of Source-Target Domains and Directionality
Sabine Schulte im Walde
|
Maximilian Köper
|
Sylvia Springorum
Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics
This paper presents a collection to assess meaning components in German complex verbs, which frequently undergo meaning shifts. We use a novel strategy to obtain source and target domain characterisations via sentence generation rather than sentence annotation. A selection of arrows adds spatial directional information to the generated contexts. We provide a broad qualitative description of the dataset, and a series of standard classification experiments verifies the quantitative reliability of the presented resource. The setup for collecting the meaning components is applicable also to other languages, regarding complex verbs as well as other language-specific targets that involve meaning shifts.
pdf
bib
abs
Integrating Predictions from Neural-Network Relation Classifiers into Coreference and Bridging Resolution
Ina Roesiger
|
Maximilian Köper
|
Kim Anh Nguyen
|
Sabine Schulte im Walde
Proceedings of the First Workshop on Computational Models of Reference, Anaphora and Coreference
Cases of coreference and bridging resolution often require knowledge about semantic relations between anaphors and antecedents. We suggest state-of-the-art neural-network classifiers trained on relation benchmarks to predict and integrate likelihoods for relations. Two experiments with representations differing in noise and complexity improve our bridging but not our coreference resolver.
2017
pdf
bib
abs
Hierarchical Embeddings for Hypernymy Detection and Directionality
Kim Anh Nguyen
|
Maximilian Köper
|
Sabine Schulte im Walde
|
Ngoc Thang Vu
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
We present a novel neural model HyperVec to learn hierarchical embeddings for hypernymy detection and directionality. While previous embeddings have shown limitations on prototypical hypernyms, HyperVec represents an unsupervised measure where embeddings are learned in a specific order and capture the hypernym–hyponym distributional hierarchy. Moreover, our model is able to generalize over unseen hypernymy pairs, when using only small sets of training data, and by mapping to other languages. Results on benchmark datasets show that HyperVec outperforms both state-of-the-art unsupervised measures and embedding models on hypernymy detection and directionality, and on predicting graded lexical entailment.
pdf
bib
abs
Applying Multi-Sense Embeddings for German Verbs to Determine Semantic Relatedness and to Detect Non-Literal Language
Maximilian Köper
|
Sabine Schulte im Walde
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers
Up to date, the majority of computational models still determines the semantic relatedness between words (or larger linguistic units) on the type level. In this paper, we compare and extend multi-sense embeddings, in order to model and utilise word senses on the token level. We focus on the challenging class of complex verbs, and evaluate the model variants on various semantic tasks: semantic classification; predicting compositionality; and detecting non-literal language usage. While there is no overall best model, all models significantly outperform a word2vec single-sense skip baseline, thus demonstrating the need to distinguish between word senses in a distributional semantic model.
pdf
bib
abs
Complex Verbs are Different: Exploring the Visual Modality in Multi-Modal Models to Predict Compositionality
Maximilian Köper
|
Sabine Schulte im Walde
Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)
This paper compares a neural network DSM relying on textual co-occurrences with a multi-modal model integrating visual information. We focus on nominal vs. verbal compounds, and zoom into lexical, empirical and perceptual target properties to explore the contribution of the visual modality. Our experiments show that (i) visual features contribute differently for verbs than for nouns, and (ii) images complement textual information, if (a) the textual modality by itself is poor and appropriate image subsets are used, or (b) the textual modality by itself is rich and large (potentially noisy) images are added.
pdf
bib
abs
Improving Verb Metaphor Detection by Propagating Abstractness to Words, Phrases and Individual Senses
Maximilian Köper
|
Sabine Schulte im Walde
Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications
Abstract words refer to things that can not be seen, heard, felt, smelled, or tasted as opposed to concrete words. Among other applications, the degree of abstractness has been shown to be a useful information for metaphor detection. Our contribution to this topic are as follows: i) we compare supervised techniques to learn and extend abstractness ratings for huge vocabularies ii) we learn and investigate norms for larger units by propagating abstractness to verb-noun pairs which lead to better metaphor detection iii) we overcome the limitation of learning a single rating per word and show that multi-sense abstractness ratings are potentially useful for metaphor detection. Finally, with this paper we publish automatically created abstractness norms for 3million English words and multi-words as well as automatically created sense specific abstractness ratings
pdf
bib
abs
IMS at EmoInt-2017: Emotion Intensity Prediction with Affective Norms, Automatically Extended Resources and Deep Learning
Maximilian Köper
|
Evgeny Kim
|
Roman Klinger
Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis
Our submission to the WASSA-2017 shared task on the prediction of emotion intensity in tweets is a supervised learning method with extended lexicons of affective norms. We combine three main information sources in a random forrest regressor, namely (1), manually created resources, (2) automatically extended lexicons, and (3) the output of a neural network (CNN-LSTM) for sentence regression. All three feature sets perform similarly well in isolation (≈ .67 macro average Pearson correlation). The combination achieves .72 on the official test set (ranked 2nd out of 22 participants). Our analysis reveals that performance is increased by providing cross-emotional intensity predictions. The automatic extension of lexicon features benefit from domain specific embeddings. Complementary ratings for affective norms increase the impact of lexicon features. Our resources (ratings for 1.6 million twitter specific words) and our implementation is publicly available at
http://www.ims.uni-stuttgart.de/data/ims_emoint.
pdf
bib
Exploring Soft-Clustering for German (Particle) Verbs across Frequency Ranges
Moritz Wittmann
|
Maximilian Köper
|
Sabine Schulte im Walde
Proceedings of the 12th International Conference on Computational Semantics (IWCS) — Short papers
pdf
bib
Exploring Multi-Modal Text+Image Models to Distinguish between Abstract and Concrete Nouns
Sai Abishek Bhaskar
|
Maximilian Köper
|
Sabine Schulte Im Walde
|
Diego Frassinelli
Proceedings of the IWCS workshop on Foundations of Situated and Multimodal Communication
2016
pdf
bib
abs
Visualisation and Exploration of High-Dimensional Distributional Features in Lexical Semantic Classification
Maximilian Köper
|
Melanie Zaiß
|
Qi Han
|
Steffen Koch
|
Sabine Schulte im Walde
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Vector space models and distributional information are widely used in NLP. The models typically rely on complex, high-dimensional objects. We present an interactive visualisation tool to explore salient lexical-semantic features of high-dimensional word objects and word similarities. Most visualisation tools provide only one low-dimensional map of the underlying data, so they are not capable of retaining the local and the global structure. We overcome this limitation by providing an additional trust-view to obtain a more realistic picture of the actual object distances. Additional tool options include the reference to a gold standard classification, the reference to a cluster analysis as well as listing the most salient (common) features for a selected subset of the words.
pdf
bib
abs
Automatically Generated Affective Norms of Abstractness, Arousal, Imageability and Valence for 350 000 German Lemmas
Maximilian Köper
|
Sabine Schulte im Walde
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
This paper presents a collection of 350,000 German lemmatised words, rated on four psycholinguistic affective attributes. All ratings were obtained via a supervised learning algorithm that can automatically calculate a numerical rating of a word. We applied this algorithm to abstractness, arousal, imageability and valence. Comparison with human ratings reveals high correlation across all rating types. The full resource is publically available at:
http://www.ims.uni-stuttgart.de/data/affective_norms/pdf
bib
Distinguishing Literal and Non-Literal Usage of German Particle Verbs
Maximilian Köper
|
Sabine Schulte im Walde
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
pdf
bib
Automatic Semantic Classification of German Preposition Types: Comparing Hard and Soft Clustering Approaches across Features
Maximilian Köper
|
Sabine Schulte im Walde
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
pdf
bib
Improving Zero-Shot-Learning for German Particle Verbs by using Training-Space Restrictions and Local Scaling
Maximilian Köper
|
Sabine Schulte im Walde
|
Max Kisselew
|
Sebastian Padó
Proceedings of the Fifth Joint Conference on Lexical and Computational Semantics
2015
pdf
bib
Multilingual Reliability and “Semantic” Structure of Continuous Word Spaces
Maximilian Köper
|
Christian Scheible
|
Sabine Schulte im Walde
Proceedings of the 11th International Conference on Computational Semantics
2014
pdf
bib
abs
A Rank-based Distance Measure to Detect Polysemy and to Determine Salient Vector-Space Features for German Prepositions
Maximilian Köper
|
Sabine Schulte im Walde
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
This paper addresses vector space models of prepositions, a notoriously ambiguous word class. We propose a rank-based distance measure to explore the vector-spatial properties of the ambiguous objects, focusing on two research tasks: (i) to distinguish polysemous from monosemous prepositions in vector space; and (ii) to determine salient vector-space features for a classification of preposition senses. The rank-based measure predicts the polysemy vs. monosemy of prepositions with a precision of up to 88%, and suggests preposition-subcategorised nouns as more salient preposition features than preposition-subcategorising verbs.
pdf
bib
abs
Fuzzy V-Measure - An Evaluation Method for Cluster Analyses of Ambiguous Data
Jason Utt
|
Sylvia Springorum
|
Maximilian Köper
|
Sabine Schulte im Walde
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
This paper discusses an extension of the V-measure (Rosenberg and Hirschberg, 2007), an entropy-based cluster evaluation metric. While the original work focused on evaluating hard clusterings, we introduce the Fuzzy V-measure which can be used on data that is inherently ambiguous. We perform multiple analyses varying the sizes and ambiguity rates and show that while entropy-based measures in general tend to suffer when ambiguity increases, a measure with desirable properties can be derived from these in a straightforward manner.