Automated Deep Lexical Acquisition for Robust Open Texts Processing

Yi Zhang, Valia Kordoni


Abstract
In this paper, we report on methods to detect and repair lexical errors for deep grammars. The lack of coverage has long been the major problem for deep processing. The various errors in large hand-crafted grammars prevent their use in real applications, and manually detecting and repairing these errors requires a significant amount of human effort. An experiment with the British National Corpus shows that about 70% of the sentences contain words unknown to the English Resource Grammar. With the help of error mining methods, many lexical errors are discovered, which cause a large part of the parsing failures. Moreover, a lexical type predictor based on a maximum entropy model automatically generates new lexical entries. The contribution of various features to the model is evaluated. With disambiguated full parsing results, the precision of the predictor is enhanced significantly.
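The abstract's lexical type predictor is a maximum entropy (multinomial logistic regression) classifier over surface features of an unknown word and its context. The following is a minimal sketch of that general idea using scikit-learn; the feature set, the toy training data, and the lexical type labels are illustrative assumptions, not the paper's actual features or the ERG lexicon.

```python
# Hedged sketch: a maximum-entropy lexical type predictor.
# Assumes scikit-learn; features and labels below are hypothetical examples.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def features(tokens, i):
    """Surface and context features for the (unknown) word at position i."""
    w = tokens[i]
    return {
        "word": w.lower(),
        "prefix3": w[:3],
        "suffix3": w[-3:],
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
        "capitalized": w[0].isupper(),
    }

# Toy training data: (tokenized sentence, target position, lexical type label).
train = [
    ("the cat sleeps".split(), 1, "n_intr_le"),
    ("she quickly ran".split(), 1, "adv_le"),
    ("they devour pizza".split(), 1, "v_np_trans_le"),
]

X = [features(toks, i) for toks, i, _ in train]
y = [label for _, _, label in train]

# DictVectorizer + multinomial logistic regression = a maximum entropy model.
model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
model.fit(X, y)

# Predict a lexical type for an unknown word in a new sentence.
toks = "the dog sleeps".split()
print(model.predict([features(toks, 1)])[0])
```

In the paper's setting, predictions like these would be turned into candidate lexical entries for the grammar, and (as the abstract notes) filtering the training signal with disambiguated full parses improves the predictor's precision.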
Anthology ID:
L06-1277
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Editors:
Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/464_pdf.pdf
Cite (ACL):
Yi Zhang and Valia Kordoni. 2006. Automated Deep Lexical Acquisition for Robust Open Texts Processing. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
Automated Deep Lexical Acquisition for Robust Open Texts Processing (Zhang & Kordoni, LREC 2006)
PDF:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/464_pdf.pdf