Katerina Veselá

Also published as: Kateřina Veselá


2004

Condition of Projectivity in the Underlying Dependency Structures
Kateřina Veselá | Jiří Havelka | Eva Hajičová
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

The claim made in this paper is that in a formal description of language, it is possible and useful to work with dependency-based underlying representations of sentences (tectogrammatical representations) meeting the condition of projectivity. The reasons for the inclusion of this condition into the definition of the tectogrammatical representations are both formally and empirically sound (Section 1). An analysis of the material offered by the Prague Dependency Treebank with annotations of the underlying syntactic structure of sentences (described in Section 2) has led to an interesting classification of non-projective constructions in Czech (Section 3). It documents that most (types of) constructions that appear to be non-projective in the surface shape of sentences can be described by means of projective trees. The realization of the surface word order (with the use of movement rules) is then relegated to the morphemic level, where the representation of the sentence has the shape of a string rather than a tree.
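
To make the condition of projectivity concrete, the sketch below checks it for a dependency tree using the common textbook definition: every node lying between a head and its dependent in the linear order must be a descendant of that head. The `heads` array encoding and the function name `is_projective` are assumptions made for this illustration; the abstract does not give the paper's formal definition or any implementation.

```python
def is_projective(heads):
    """Return True if the dependency tree is projective.

    heads[i] is the index of the parent of node i in the linear order,
    or -1 for the root. This encoding is an assumption of the sketch,
    and the input is assumed to be a well-formed tree (no cycles).
    """

    def is_descendant(node, ancestor):
        # Walk up from node towards the root, looking for ancestor.
        while node != -1:
            if node == ancestor:
                return True
            node = heads[node]
        return False

    for dep, head in enumerate(heads):
        if head == -1:
            continue  # the root has no incoming edge to check
        lo, hi = sorted((dep, head))
        # Every node strictly between head and dependent must be
        # dominated by the head; otherwise the edge is non-projective.
        for between in range(lo + 1, hi):
            if not is_descendant(between, head):
                return False
    return True


# A three-node tree whose middle node heads both neighbours.
projective_tree = [1, -1, 1]
# Node 1 depends on node 3, but node 2 (the root) lies between them.
non_projective_tree = [2, 3, -1, 2]
```

Here `is_projective(projective_tree)` holds, while `non_projective_tree` fails the check because the edge from node 3 to node 1 spans the root.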

Annotators’ Agreement: The Case of Topic-Focus Articulation
Kateřina Veselá | Jiří Havelka | Eva Hajičová
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

The annotation of the Prague Dependency Treebank (PDT) is conceived of as a multilayered scenario that also comprises dependency representations (tectogrammatical tree structures, TGTSs) of the underlying structure of the sentences. TGTSs capture three basic aspects of the underlying structure of sentences: (a) the dependency tree structure, (b) the kinds of dependency syntactic relations, and (c) the basic characteristics of the topic-focus articulation (TFA). Since the PDT is a large collection and the annotations on the deepest layer are to a large extent performed by several human annotators (based on an automatic preprocessing module), it is essential to monitor the consistency of the annotators and the agreement among them. In the present paper, we summarize the results of the evaluation of parallel annotations of several samples taken from the PDT and the measures adopted to improve the consistency of annotations.
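
The abstract does not say which agreement measure the evaluation used. As an illustration only, the sketch below computes two widely used measures for a pair of parallel annotations: raw observed agreement and Cohen's kappa, which corrects for agreement expected by chance. The function names and the label values `"t"` and `"f"` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def observed_agreement(labels_a, labels_b):
    """Fraction of items on which the two annotators chose the same label."""
    matches = sum(x == y for x, y in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    n = len(labels_a)
    p_o = observed_agreement(labels_a, labels_b)
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of each annotator's label proportions,
    # summed over all labels either annotator used.
    p_e = sum(
        counts_a[label] * counts_b[label]
        for label in counts_a.keys() | counts_b.keys()
    ) / (n * n)
    if p_e == 1:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical parallel annotations of four nodes (illustrative labels).
annotator_1 = ["t", "t", "f", "f"]
annotator_2 = ["t", "f", "f", "f"]
```

For these made-up labels the annotators agree on three of four items, and kappa is lower than the raw agreement because part of that agreement would be expected by chance.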