Perspectives of Turning Prague Dependency Treebank into a Knowledge Base

Václav Novák, Jan Hajič


Abstract
Recently, the Prague Dependency Treebank 2.0 (PDT 2.0) has emerged as the largest text corpora annotated on the level of tectogrammatical representation (“linguistic meaning”) described in Sgall et al. (2004) and containing about 0.8 milion words (see Hajic (2004)). We hope that this level of annotation is so close to the meaning of the utterances contained in the corpora that it should enable us to automatically transform texts contained in the corpora to the form of knowledge base, usable for information extraction, question answering, summarization, etc. We can use Multilayered Extended Semantic Networks (MultiNet) described in Helbig (2006) as the target formalism. In this paper we discuss the suitability of such approach and some of the main issues that will arise in the process. In section 1, we introduce formalisms underlying PDT 2.0 and MultiNet, in section 2. We describe the role MultiNet can play in the system of Functional Generative Description (FGD), section 3 discusses issues of automatic conversion to MultiNet and section 4 gives some conclusions.
Anthology ID:
L06-1137
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Editors:
Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/245_pdf.pdf
DOI:
Bibkey:
Cite (ACL):
Václav Novák and Jan Hajič. 2006. Perspectives of Turning Prague Dependency Treebank into a Knowledge Base. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
Perspectives of Turning Prague Dependency Treebank into a Knowledge Base (Novák & Hajič, LREC 2006)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/245_pdf.pdf