Yana Strakatova
2021
Automatic Classification of Attributes in German Adjective-Noun Phrases
Neele Falk
|
Yana Strakatova
|
Eva Huber
|
Erhard Hinrichs
Proceedings of the 14th International Conference on Computational Semantics (IWCS)
Adjectives such as heavy (as in heavy rain) and windy (as in windy day) provide possible values for the attributes intensity and climate, respectively. The attributes themselves are not overtly realized and are in this sense implicit. While these attributes can be easily inferred by humans, their automatic classification poses a challenging task for computational models. We present the following contributions: (1) We gain new insights into the attribute selection task for German. More specifically, we develop computational models for this task that are able to generalize to unseen data. Moreover, we show that classification accuracy depends, inter alia, on the degree of polysemy of the lexemes involved, on the generalization potential of the training data and on the degree of semantic transparency of the adjective-noun pairs in question. (2) We provide the first resource for computational and linguistic experiments with German adjective-noun pairs that can be used for attribute selection and related tasks. In order to safeguard against unwelcome memorization effects, we present an automatic data augmentation method based on a lexical resource that can increase the size of the training data to a large extent.
2020
All That Glitters is Not Gold: A Gold Standard of Adjective-Noun Collocations for German
Yana Strakatova
|
Neele Falk
|
Isabel Fuhrmann
|
Erhard Hinrichs
|
Daniela Rossmann
Proceedings of the Twelfth Language Resources and Evaluation Conference
In this paper we present the GerCo dataset of adjective-noun collocations for German, such as alter Freund ‘old friend’ and tiefe Liebe ‘deep love’. The annotation has been performed by experts based on the annotation scheme introduced in this paper. The resulting dataset contains 4,732 positive and negative instances of collocations and covers all the 16 semantic classes of adjectives as defined in the German wordnet GermaNet. The dataset can serve as a reliable empirical basis for comparing different theoretical frameworks concerned with collocations or as material for data-driven approaches to the studies of collocations including different machine learning experiments. This paper addresses the latter issue by using the GerCo dataset for evaluating different models on the task of automatic collocation identification. We compare lexical association measures with static and contextualized word embeddings. The experiments show that word embeddings outperform methods based on statistical association measures by a wide margin.
2019
Semantic Modelling of Adjective-Noun Collocations Using FrameNet
Yana Strakatova
|
Erhard Hinrichs
Proceedings of the Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019)
In this paper we argue that Frame Semantics (Fillmore, 1982) provides a good framework for semantic modelling of adjective-noun collocations. More specifically, the notion of a frame is rich enough to account for nouns from different semantic classes and to model semantic relations that hold between an adjective and a noun in terms of Frame Elements. We have substantiated these findings by considering a sample of adjective-noun collocations from German such as “enger Freund” ‘close friend’ and “starker Regen” ‘heavy rain’. The data sample is taken from different semantic fields identified in the German wordnet GermaNet (Hamp and Feldweg, 1997; Henrich and Hinrichs, 2010). The study is based on the electronic dictionary DWDS (Klein and Geyken, 2010) and uses the collocation extraction tool Wortprofil (Geyken et al., 2009). The FrameNet modelling is based on the online resource available at http://framenet.icsi.berkeley.edu. Since FrameNets are available for a range of typologically different languages, it is feasible to extend the current case study to other languages.