Eirik Hektoen
1997
Probabilistic Parse Selection based on Semantic Cooccurrences
Eirik Hektoen
Proceedings of the Fifth International Workshop on Parsing Technologies
This paper presents a new technique for selecting the correct parse of ambiguous sentences based on a probabilistic analysis, of lexical cooccurrences in semantic forms. The method is called “Semco” (for semantic cooccurrence analysis) and is specifically targeted at the differential distribution of such cooccurrences in correct and incorrect parses. It uses Bayesian Estimation for the cooccurrence probabilities to achieve higher accuracy for sparse data than the more common Maximum Likelihood Estimation would. It has been tested on the Wall Street Journal corpus (in the PENN Treebank) and shown to find the correct parse of 60.9% of parseable sentences of 6-20 words.