Towards a Semi-Automatic Detection of Reflexive and Reciprocal Constructions and Their Representation in a Valency Lexicon
Václava Kettnerová | Markéta Lopatková | Anna Vernerová | Petra Barančíková
Proceedings of the 12th Language Resources and Evaluation Conference

Valency lexicons usually describe the valency behavior of verbs in non-reflexive and non-reciprocal constructions. However, reflexive and reciprocal constructions are common morphosyntactic forms of verbs. Both of these constructions are characterized by regular changes in the morphosyntactic properties of verbs; thus, they can be described by grammatical rules. On the other hand, the possibility of forming reflexive and/or reciprocal constructions cannot be trivially derived from the morphosyntactic structure of verbs, as it is conditioned by their semantic properties as well. A large-coverage valency lexicon allowing for rule-based generation of all well-formed verb constructions should thus integrate the information on reflexivity and reciprocity. In this paper, we propose a semi-automatic procedure, based on grammatical constraints on reflexivity and reciprocity, that detects those verbs that form reflexive and reciprocal constructions in corpus data. However, the exploitation of corpus data for this purpose is complicated by the diverse functions of reflexive markers, which extend beyond the domain of reflexivity and reciprocity. The list of verbs identified by this procedure is thus further used in an automatic experiment that applies word embeddings to detect semantically similar verbs. These candidate verbs have been manually verified, and the annotation of their reflexive and reciprocal constructions has been integrated into the valency lexicon of Czech verbs, VALLEX.


Reflexives in Czech from a Dependency Perspective
Václava Kettnerová | Markéta Lopatková
Proceedings of the Fifth International Conference on Dependency Linguistics (Depling, SyntaxFest 2019)


CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
Daniel Zeman | Martin Popel | Milan Straka | Jan Hajič | Joakim Nivre | Filip Ginter | Juhani Luotolahti | Sampo Pyysalo | Slav Petrov | Martin Potthast | Francis Tyers | Elena Badmaeva | Memduh Gokirmak | Anna Nedoluzhko | Silvie Cinková | Jan Hajič jr. | Jaroslava Hlaváčová | Václava Kettnerová | Zdeňka Urešová | Jenna Kanerva | Stina Ojala | Anna Missilä | Christopher D. Manning | Sebastian Schuster | Siva Reddy | Dima Taji | Nizar Habash | Herman Leung | Marie-Catherine de Marneffe | Manuela Sanguinetti | Maria Simi | Hiroshi Kanayama | Valeria de Paiva | Kira Droganova | Héctor Martínez Alonso | Çağrı Çöltekin | Umut Sulubacak | Hans Uszkoreit | Vivien Macketanz | Aljoscha Burchardt | Kim Harris | Katrin Marheinecke | Georg Rehm | Tolga Kayadelen | Mohammed Attia | Ali Elkahky | Zhuoran Yu | Emily Pitler | Saran Lertpradit | Michael Mandl | Jesse Kirchner | Hector Fernandez Alcalde | Jana Strnadová | Esha Banerjee | Ruli Manurung | Antonio Stella | Atsuko Shimada | Sookyoung Kwak | Gustavo Mendonça | Tatiana Lando | Rattima Nitisaroj | Josie Li
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

The Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their learning systems on the same data sets. In 2017, the task was devoted to learning dependency parsers for a large number of languages, in a real-world setting without any gold-standard annotation on input. All test sets followed a unified annotation scheme, namely that of Universal Dependencies. In this paper, we define the task and evaluation methodology, describe how the data sets were prepared, report and analyze the main results, and provide a brief categorization of the different approaches of the participating systems.

ParaDi: Dictionary of Paraphrases of Czech Complex Predicates with Light Verbs
Petra Barančíková | Václava Kettnerová
Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)

We present a new freely available dictionary of paraphrases of Czech complex predicates with light verbs, ParaDi. Candidates for single predicative paraphrases of selected complex predicates have been extracted automatically from large monolingual data using word2vec. They have been manually verified and further refined. We demonstrate one of many possible applications of ParaDi in an experiment with improving machine translation quality.


Alternations: From Lexicon to Grammar And Back Again
Markéta Lopatková | Václava Kettnerová
Proceedings of the Workshop on Grammar and Lexicon: interactions and interfaces (GramLex)

An excellent example of a phenomenon bridging a lexicon and a grammar is provided by grammaticalized alternations (e.g., passivization, reflexivity, and reciprocity): these alternations represent productive grammatical processes which are, however, lexically determined. While grammaticalized alternations keep the lexical meaning of verbs unchanged, they are usually characterized by various changes in their morphosyntactic structure. In this contribution, we demonstrate, using the example of reciprocity and its representation in the valency lexicon of Czech verbs, VALLEX, how a linguistic description of complex (and still systemic) changes characteristic of grammaticalized alternations can benefit from an integration of grammatical rules into a valency lexicon. In contrast to other types of grammaticalized alternations, reciprocity in Czech has received relatively little attention, although it closely interacts with various linguistic phenomena (e.g., with light verbs, diatheses, and reflexivity).

Distribution of Valency Complements in Czech Complex Predicates: Between Verb and Noun
Václava Kettnerová | Eduard Bejček
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In this paper, we focus on Czech complex predicates formed by a light verb and a predicative noun expressed as the direct object. Although Czech, as an inflectional language encoding syntactic relations via morphological cases, provides an excellent opportunity to study the distribution of valency complements in the syntactic structure with complex predicates, this distribution has not been described so far. On the basis of a manual analysis of the richly annotated data from the Prague Dependency Treebank, we thus formulate principles governing this distribution. In an automatic experiment, we verify these principles on well-formed syntactic structures from the Prague Dependency Treebank and the Prague Czech-English Dependency Treebank with very satisfactory results: the distribution of 97% of valency complements in the surface structure is governed by the proposed principles. These results corroborate that the surface structure formation of complex predicates is a regular process.


At the Lexicon-Grammar Interface: The Case of Complex Predicates in the Functional Generative Description
Václava Kettnerová | Markéta Lopatková
Proceedings of the Third International Conference on Dependency Linguistics (Depling 2015)


Automatic Mapping Lexical Resources: A Lexical Unit as the Keystone
Eduard Bejček | Václava Kettnerová | Markéta Lopatková
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper presents the fully automatic linking of two valency lexicons of Czech verbs: VALLEX and PDT-VALLEX. Despite the same theoretical background adopted by these lexicons and the same linguistic phenomena they focus on, the fully automatic mapping of these resources is not straightforward. We demonstrate that converting these lexicons into a common format represents a relatively easy part of the task, whereas the automatic identification of pairs of corresponding valency frames (representing lexical units of verbs) poses difficulties. The overall achieved precision of 81% can be considered satisfactory. However, the higher the number of lexical units a verb has, the lower the precision of their automatic mapping usually is. Moreover, we show that especially (i) supplementing further information on lexical units and (ii) revealing and reconciling regular discrepancies in their annotations can greatly assist in the automatic merging.

To Pay or to Get Paid: Enriching a Valency Lexicon with Diatheses
Anna Vernerová | Václava Kettnerová | Markéta Lopatková
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Valency lexicons typically describe only unmarked usages of verbs (the active form); however, verbs prototypically enter different surface structures. In this paper, we focus on the so-called diatheses, i.e., the relations between different surface syntactic manifestations of verbs that are brought about by changes in the morphological category of voice, e.g., the passive diathesis. The change in voice of a verb is prototypically associated with shifts of some of its valency complementations in the surface structure. These shifts are implied by changes in morphemic forms of the involved valency complementations and are regular enough to be captured by syntactic rules. However, as diatheses are lexically conditioned, their applicability to an individual lexical unit of a verb is not predictable from its valency frame alone. In this work, we propose a representation of this linguistic phenomenon in a valency lexicon of Czech verbs, VALLEX, with the aim of enhancing this lexicon with information on individual types of Czech diatheses. In order to reduce the amount of necessary manual annotation, a semi-automatic method is developed. This method draws evidence from a large morphologically annotated corpus, relying on grammatical constraints on the applicability of individual types of diatheses.


The Representation of Czech Light Verb Constructions in a Valency Lexicon
Václava Kettnerová | Markéta Lopatková
Proceedings of the Second International Conference on Dependency Linguistics (DepLing 2013)