Parsing Without Grammar

Shinsuke Mori, Makoto Nagao


Abstract
We describe and experimentally evaluate a method for parsing a tagged corpus without a grammar, modeling natural language as a context-free language. The method rests on three hypotheses. 1) Part-of-speech sequences that form the right-hand side of a rewriting rule are less constrained in which parts of speech precede and follow them than non-constituent sequences are. 2) Part-of-speech sequences directly derived from the same non-terminal symbol have similar environments. 3) The most suitable set of rewriting rules yields the greatest reduction in corpus size. Based on these hypotheses, the system finds a set of constituent-like part-of-speech sequences and replaces them with a new symbol. Repeating this process yields a set of rewriting rules (a grammar) and a bracketed corpus.
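The first hypothesis can be illustrated with a small sketch. The abstract does not say how "less constrained" is measured, so the code below assumes, purely for illustration, the Shannon entropy of the tags seen immediately to the left and right of each part-of-speech n-gram. The function name `context_entropy`, the boundary markers `<s>` and `</s>`, and the toy corpus are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def context_entropy(corpus, n=2):
    """Map each POS n-gram to (left_entropy, right_entropy) in bits.

    Assumption: entropy of the neighbouring tag stands in for how
    "unconstrained" a sequence's environment is; the paper may use
    a different measure.
    """
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    for sent in corpus:
        # Hypothetical boundary markers so every n-gram has two neighbours.
        padded = ["<s>"] + list(sent) + ["</s>"]
        for i in range(1, len(padded) - n):
            gram = tuple(padded[i:i + n])
            left[gram][padded[i - 1]] += 1
            right[gram][padded[i + n]] += 1

    def entropy(counter):
        total = sum(counter.values())
        return -sum(c / total * math.log2(c / total) for c in counter.values())

    return {g: (entropy(left[g]), entropy(right[g])) for g in left}


# Toy tagged corpus (invented for this example).
corpus = [
    ["DET", "NOUN", "VERB", "DET", "NOUN"],
    ["DET", "NOUN", "VERB"],
    ["PRON", "VERB", "DET", "NOUN"],
    ["DET", "NOUN", "VERB", "PRON"],
]

scores = context_entropy(corpus, n=2)
# Rank n-grams by the smaller of their two entropies: a sequence is only
# "free" if both its left and its right environment vary.
ranked = sorted(scores, key=lambda g: min(scores[g]), reverse=True)
```

In this toy corpus the sequence DET NOUN is preceded and followed by varying tags, while NOUN VERB is always preceded by DET, so ranking by the smaller of the two entropies places DET NOUN above NOUN VERB. A real corpus would need frequency thresholds and smoothing, which this sketch omits.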
Anthology ID:
1995.iwpt-1.22
Volume:
Proceedings of the Fourth International Workshop on Parsing Technologies
Month:
September 20-24
Year:
1995
Address:
Prague and Karlovy Vary, Czech Republic
Editors:
Eva Hajicova, Bernard Lang, Robert Berwick, Harry Bunt, Bob Carpenter, Ken Church, Aravind Joshi, Ronald Kaplan, Martin Kay, Makoto Nagao, Anton Nijholt, Mark Steedman, Henry Thompson, Masaru Tomita, K. Vijay-Shanker, Yorick Wilks, Kent Wittenburg
Venues:
IWPT | WS
SIG:
SIGPARSE
Publisher:
Association for Computational Linguistics
Pages:
174–185
URL:
https://aclanthology.org/1995.iwpt-1.22
Cite (ACL):
Shinsuke Mori and Makoto Nagao. 1995. Parsing Without Grammar. In Proceedings of the Fourth International Workshop on Parsing Technologies, pages 174–185, Prague and Karlovy Vary, Czech Republic. Association for Computational Linguistics.
Cite (Informal):
Parsing Without Grammar (Mori & Nagao, IWPT-WS 1995)
PDF:
https://aclanthology.org/1995.iwpt-1.22.pdf