Catalin Mititelu

Also published as: Cătălin Mititelu


Aligning the Romanian Reference Treebank and the Valence Lexicon of Romanian Verbs
Ana-Maria Barbu | Verginica Barbu Mititelu | Cătălin Mititelu
Proceedings of the Thirteenth Language Resources and Evaluation Conference

We present our efforts to align two language resources for Romanian, the Romanian Reference Treebank and the Valence Lexicon of Romanian Verbs. For each treebank occurrence of a verb that has an entry in the lexicon, a set of valence frames is automatically assigned, then manually validated and, when necessary, corrected. Validating a valence frame also means semantically disambiguating the verb in the respective context. The validation is done by two linguists, on complementary datasets. However, a subset of verbs was validated by both annotators, and Cohen’s κ is 0.87 for this subset. The alignment also serves as a method of enhancing the quality of the two resources, as in the process we identify morpho-syntactic annotation mistakes and incomplete or missing valence frames. Information from each resource complements the information from the other, which increases the value of both. The treebank and the lexicon are freely available, and the links discovered between them are made available on GitHub.
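The agreement figure reported above is Cohen's κ, which compares the observed agreement between two annotators with the agreement expected by chance. The following is a minimal sketch of that computation, not the authors' code: the function name `cohen_kappa` and the frame labels in the example are hypothetical and chosen only for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items.

    Hypothetical helper for illustration; labels can be any hashable
    values, for example valence frame identifiers.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the annotators' marginal label
    # proportions, summed over all labels.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one and the same label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Invented example data: six verb occurrences, three candidate frames.
annotator_a = ["f1", "f1", "f2", "f2", "f3", "f1"]
annotator_b = ["f1", "f1", "f2", "f3", "f3", "f1"]
kappa = cohen_kappa(annotator_a, annotator_b)
```

In this invented example the annotators agree on five of six items, and κ comes out lower than the raw agreement because part of that agreement is expected by chance.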


Syntactic Semantic Correspondence in Dependency Grammar
Cătălina Mărănduc | Cătălin Mititelu | Victoria Bobicev
Proceedings of the 16th International Workshop on Treebanks and Linguistic Theories

A Multiform Balanced Dependency Treebank for Romanian
Mihaela Colhon | Cătălina Mărănduc | Cătălin Mititelu
Proceedings of the Workshop Knowledge Resources for the Socio-Economic Sciences and Humanities associated with RANLP 2017

The UAIC-RoDia-DepTb is a balanced treebank containing texts in non-standard language: 2,575 chat sentences, old Romanian texts (a Gospel printed in 1648, a codex of laws printed in 1818, a novel written in 1910), regional popular poetry, legal texts, Romanian and foreign fiction, and quotations. The proportions are comparable; each of these text types is represented by a subset of at least 1,000 phrases, so that the parser can be trained on their peculiarities. The annotation of the treebank started in 2007, and it uses classical tags, such as those in school grammar, with the intention of using the resource for didactic purposes. The classification of circumstantial modifiers is rich in semantic information. We present in this paper the ongoing development of this resource, which has been automatically annotated and entirely manually corrected. We aim to add new texts and to make the treebank available in more formats, keeping all the annotated morphological and syntactic information and adding logical-semantic information. We describe here two conversions, from the classic syntactic format into Universal Dependencies format and into a logical-semantic layer, which is briefly presented.


Catalin Mititelu | Verginica Barbu Mititelu
Proceedings of the 4th Workshop on Cognitive Aspects of the Lexicon (CogALex)