GLR* – An Efficient Noise-skipping Parsing Algorithm For Context Free Grammars

Alon Lavie, Masaru Tomita


Abstract
This paper describes GLR*, a parser that can parse any input sentence by ignoring unrecognizable parts of the sentence. In case the standard parsing procedure fails to parse an input sentence, the parser nondeterministically skips some word(s) in the sentence, and returns the parse with fewest skipped words. Therefore, the parser will return some parse(s) with any input sentence, unless no part of the sentence can be recognized at all. The problem can be defined in the following way: Given a context-free grammar G and a sentence S, find and parse S' – the largest subset of words of S, such that S' ∈ L(G). The algorithm described in this paper is a modification of the Generalized LR (Tomita) parsing algorithm [Tomita, 1986] . The parser accommodates the skipping of words by allowing shift operations to be performed from inactive state nodes of the Graph Structured Stack. A heuristic similar to beam search makes the algorithm computationally tractable. There have been several other approaches to the problem of robust parsing, most of which are special purpose algorithms [Carbonell and Hayes, 1984] , [Ward, 1991] and others. Because our approach is a modification to a standard context-free parsing algorithm, all the techniques and grammars developed for the standard parser can be applied as they are. Also, in case the input sentence is by itself grammatical, our parser behaves exactly as the standard GLR parser. The modified parser, GLR*, has been implemented and integrated with the latest version of the Generalized LR Parser/Compiler [Tomita et al , 1988], [Tomita, 1990]. We discuss an application of the GLR* parser to spontaneous speech understanding and present some preliminary tests on the utility of the GLR* parser in such settings.
Anthology ID:
1993.iwpt-1.12
Volume:
Proceedings of the Third International Workshop on Parsing Technologies
Month:
August 10-13
Year:
1993
Address:
Tilburg, Netherlands and Durbuy, Belgium
Editors:
Harry Bunt, Robert Berwick, Ken Church, Aravind Joshi, Ronald Kaplan, Martin Kay, Bernard Lang, Makoto Nagao, Anton Nijholt, Mark Steedman, Henry Thompson, Masaru Tomita, K. Vijay-Shanker, Yorick Wilks, Kent Wittenburg
Venue:
IWPT
SIG:
SIGPARSE
Publisher:
Association for Computational Linguistics
Note:
Pages:
123–134
Language:
URL:
https://aclanthology.org/1993.iwpt-1.12
DOI:
Bibkey:
Cite (ACL):
Alon Lavie and Masaru Tomita. 1993. GLR* – An Efficient Noise-skipping Parsing Algorithm For Context Free Grammars. In Proceedings of the Third International Workshop on Parsing Technologies, pages 123–134, Tilburg, Netherlands and Durbuy, Belgium. Association for Computational Linguistics.
Cite (Informal):
GLR* – An Efficient Noise-skipping Parsing Algorithm For Context Free Grammars (Lavie & Tomita, IWPT 1993)
Copy Citation:
PDF:
https://aclanthology.org/1993.iwpt-1.12.pdf