Identifying semantic argument types in predication contexts is not a straightforward task for several reasons, such as inherent polysemy, coercion, and copredication phenomena. In this paper, we train monolingual and multilingual classifiers with a zero-shot cross-lingual approach to identify semantic argument types in predications using pre-trained language models as feature extractors. We train classifiers for different semantic argument types and for both verbal and adjectival predications. Furthermore, we propose a method to detect copredication using these classifiers through identifying the argument semantic type targeted in different predications over the same noun in a sentence. We evaluate the performance of the method on copredication test data with Food•Event nouns for 5 languages.
The paper presents a frame-based model of inherently polysemous nouns (such as ‘book’, which denotes both a physical object and an informational content) in which the meaning facets are directly accessible via attributes and which also takes into account the semantic relations between the facets. Predication over meaning facets (as in ‘memorize the book’) is then modeled as targeting the value of the corresponding facet attribute while coercion (as in ‘finish the book’) is modeled via specific patterns that enrich the predication. We use a compositional framework whose basic components are lexicalized syntactic trees paired with semantic frames and in which frame unification is triggered by tree composition. The approach is applied to a variety of combinations of predications over meaning facets and coercions.
This paper describes the first release of RRGparbank, a multilingual parallel treebank for Role and Reference Grammar (RRG) containing annotations of George Orwell’s novel 1984 and its translations. The release comprises the entire novel for English and a constructionally diverse and highly parallel sample (“seed”) for German, French and Russian. The paper gives an overview of annotation decisions that have been taken and describes the adopted treebanking methodology. Finally, as a possible application, a multilingual parser is trained on the treebank data. RRGparbank is one of the first resources to apply RRG to large amounts of real-world data. Furthermore, it enables comparative and typological corpus studies in RRG. And, finally, it creates new possibilities of data-driven NLP applications based on RRG.
We describe the treatment of verbs with prepositional complements inHaGenLex, a semantically based computer lexicon for German.Prepositional verbs such as bestehen auf (insist on) subcategorize for a prepositional phrase where the preposition usually has no independent meaning of its own. The lexical semantic information inHaGenLex is specified by means of MultiNet, a full-fledged knowledge representation formalism, which proves to be particularly useful for representing the semantics of verbs with prepositional complements. We indicate how the semantic representation in HaGenLex can be used to define semantic classes of prepositional verbs and briefly discuss the relation of these classes to Levin's verb classes. Moreover, wepresent first results on the automatic identification of prepositionalverbs by corpus-based methods.