Sandra Wellinghoff
2006
Automated detection and annotation of term definitions in German text corpora
Angelika Storrer, Sandra Wellinghoff
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
We describe an approach to automatically detecting and annotating definitions of technical terms in German text corpora. The approach focuses on verbs that typically appear in definitions (definitor verbs). We specify search patterns based on the valency frames of these definitor verbs and use them (1) to detect and delimit text segments containing definitions and (2) to annotate their main functional components: the definiendum (the term that is defined) and the definiens (meaning postulates for this term). On the basis of these annotations, we aim to automatically extract WordNet-style semantic relations that hold between the head nouns of the definiendum and the head nouns of the definiens. In this paper, we describe our annotation scheme for definitions and report on two studies: (1) a pilot study that evaluates our definition extraction approach using a German corpus with manually annotated definitions as a gold standard, and (2) a feasibility study that assesses whether hypernym, hyponym and holonym relations can be extracted from these annotated definitions.
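As a rough illustration of the idea of definitor-verb search patterns, the following minimal sketch matches a sentence against a few surface patterns and labels the definiendum and definiens. It is an assumption-laden toy, not the authors' implementation: the three regular expressions, the `PATTERNS` list, and the `annotate` function are hypothetical, and real valency-frame patterns would operate on linguistically preprocessed text rather than on raw strings.

```python
import re

# Hypothetical surface patterns, one per definitor verb.
# These are illustrative assumptions, not the patterns used in the paper.
PATTERNS = [
    # "Unter X versteht man Y."
    ("verstehen", re.compile(
        r"^Unter (?P<definiendum>.+?) versteht man (?P<definiens>.+?)\.$")),
    # "Als X bezeichnet man Y."
    ("bezeichnen", re.compile(
        r"^Als (?P<definiendum>.+?) bezeichnet man (?P<definiens>.+?)\.$")),
    # "X ist definiert als Y."
    ("definieren", re.compile(
        r"^(?P<definiendum>.+?) ist definiert als (?P<definiens>.+?)\.$")),
]


def annotate(sentence):
    """Return the definitor verb, definiendum and definiens of a sentence,
    or None if no pattern matches."""
    text = sentence.strip()
    for verb, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return {
                "definitor": verb,
                "definiendum": match.group("definiendum"),
                "definiens": match.group("definiens"),
            }
    return None


if __name__ == "__main__":
    print(annotate("Unter einem Hyperlink versteht man einen Verweis "
                   "auf ein anderes Dokument."))
    print(annotate("Der Hyperlink ist blau."))
```

In this sketch, the head nouns of the two labelled spans (here "Hyperlink" and "Verweis") would be the candidates for a subsequent relation-extraction step; identifying those heads reliably would require syntactic analysis that the toy patterns do not provide.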