2008
Some Fine Points of Hybrid Natural Language Parsing
Peter Adolphs | Stephan Oepen | Ulrich Callmeier | Berthold Crysmann | Dan Flickinger | Bernd Kiefer
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Large-scale grammar-based parsing systems increasingly rely on independently developed, more specialized components for pre-processing their input. However, different tools make conflicting assumptions about very basic properties such as tokenization. To make linguistic annotation gathered in pre-processing available to deep parsing, a hybrid NLP system needs to establish a coherent mapping between the two universes. Our basic assumption is that tokens are best described by attribute value matrices (AVMs) that may be arbitrarily complex. We propose a powerful resource-sensitive rewrite formalism, chart mapping, that allows us to mediate between the token descriptions delivered by shallow pre-processing components and the input expected by the grammar. We furthermore propose a novel treatment of unknown words in which all generic lexical entries licensed by a particular token AVM are instantiated. Again, chart mapping is used to give the grammar writer full control over which items (e.g. native vs. generic lexical items) enter syntactic parsing. We discuss several further uses of the original idea and report on early experiences with the new machinery.
2004
The DeepThought Core Architecture Framework
Ulrich Callmeier | Andreas Eisele | Ulrich Schäfer | Melanie Siegel
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
2000
Measure for Measure: Parser Cross-fertilization - Towards Increased Component Comparability and Exchange
Stephan Oepen | Ulrich Callmeier
Proceedings of the Sixth International Workshop on Parsing Technologies
Over the past few years, significant progress has been made in efficient processing with wide-coverage HPSG grammars. HPSG-based parsing systems are now available that can process medium-complexity sentences (of ten to twenty words, say) in average parse times equivalent to real (i.e. human reading) time. A large number of engineering improvements in current HPSG systems were achieved through collaboration of multiple research centers and mutual exchange of experience, encoding techniques, algorithms, and even pieces of software. This article presents an approach to grammar and system engineering, termed competence & performance profiling, that makes systematic experimentation and the precise empirical study of system properties a focal point in development. Adapting the profiling metaphor familiar from software engineering to constraint-based grammars and parsers enables developers to maintain an accurate record of system evolution, identify grammar and system deficiencies quickly, and compare against earlier versions or between different systems. We discuss a number of exemplary problems that motivate the experimental approach, and apply the empirical methodology in a fairly detailed discussion of what was achieved during a development period of three years. Given the collaborative nature of the setup, the empirical results we present involve research and achievements of a large group of people.
Cross-Platform, Cross-Grammar Comparison – Can it be Done?
Ulrich Callmeier | Stephan Oepen
Proceedings of the COLING-2000 Workshop on Efficiency In Large-Scale Parsing Systems