Probabilistic Parse Selection based on Semantic Cooccurrences

Eirik Hektoen


Abstract
This paper presents a new technique for selecting the correct parse of ambiguous sentences based on a probabilistic analysis of lexical cooccurrences in semantic forms. The method is called “Semco” (for semantic cooccurrence analysis) and is specifically targeted at the differential distribution of such cooccurrences in correct and incorrect parses. It uses Bayesian estimation for the cooccurrence probabilities to achieve higher accuracy on sparse data than the more common maximum likelihood estimation would. It has been tested on the Wall Street Journal corpus (in the Penn Treebank) and shown to find the correct parse of 60.9% of parseable sentences of 6–20 words.
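The contrast the abstract draws between Bayesian and maximum likelihood estimation on sparse data can be illustrated with a generic sketch. This is not the Semco estimator, whose details the abstract does not give. It assumes a simple binomial model with a Beta prior, and the function names and the uniform prior (`alpha=1.0`, `beta=1.0`) are hypothetical choices made only for illustration.

```python
def mle(successes, trials):
    """Maximum likelihood estimate: the observed relative frequency."""
    if trials == 0:
        raise ValueError("MLE is undefined with no observations")
    return successes / trials


def bayes_posterior_mean(successes, trials, alpha=1.0, beta=1.0):
    """Posterior mean of a binomial probability under a Beta(alpha, beta) prior.

    The uniform prior (alpha = beta = 1) is an assumption for this sketch,
    not a value taken from the paper.
    """
    return (successes + alpha) / (trials + alpha + beta)


# Sparse data: a cooccurrence never seen in 3 observations.
sparse_mle = mle(0, 3)                      # 0.0, rules the event out entirely
sparse_bayes = bayes_posterior_mean(0, 3)   # small but non-zero

# Plentiful data: the two estimates nearly coincide.
dense_mle = mle(300, 1000)
dense_bayes = bayes_posterior_mean(300, 1000)
```

With only a few observations, the maximum likelihood estimate assigns probability zero to anything unseen, while the posterior mean stays strictly positive. As the number of observations grows, the influence of the prior fades and the two estimates converge.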
Anthology ID:
1997.iwpt-1.15
Volume:
Proceedings of the Fifth International Workshop on Parsing Technologies
Month:
September 17-20
Year:
1997
Address:
Boston/Cambridge, Massachusetts, USA
Editors:
Anton Nijholt, Robert C. Berwick, Harry C. Bunt, Bob Carpenter, Eva Hajicova, Mark Johnson, Aravind Joshi, Ronald Kaplan, Martin Kay, Bernard Lang, Alon Lavie, Makoto Nagao, Mark Steedman, Masaru Tomita, K. Vijay-Shanker, David Weir, Kent Wittenburg, Mats Wiren
Venue:
IWPT
SIG:
SIGPARSE
Publisher:
Association for Computational Linguistics
Pages:
113–122
URL:
https://aclanthology.org/1997.iwpt-1.15
Cite (ACL):
Eirik Hektoen. 1997. Probabilistic Parse Selection based on Semantic Cooccurrences. In Proceedings of the Fifth International Workshop on Parsing Technologies, pages 113–122, Boston/Cambridge, Massachusetts, USA. Association for Computational Linguistics.
Cite (Informal):
Probabilistic Parse Selection based on Semantic Cooccurrences (Hektoen, IWPT 1997)
PDF:
https://aclanthology.org/1997.iwpt-1.15.pdf