Daria Ryzhova


2022

Multilingual Pragmaticon: Database of Discourse Formulae
Anton Buzanov | Polina Bychkova | Arina Molchanova | Anna Postnikova | Daria Ryzhova
Proceedings of the Thirteenth Language Resources and Evaluation Conference

The paper presents a multilingual database intended as a tool for typological analysis of response constructions called discourse formulae (DF), cf. English ‘No way!’ or French ‘Ça va!’ (‘all right’). The two primary qualities that make DF of theoretical interest for linguists are their idiomaticity and the special nature of their meanings (cf. consent, refusal, negation), determined by their dialogical function. The formal and semantic structures of these items are language-specific. Compiling a database with DF from various languages would help estimate the diversity of DF in both of these aspects and, at the same time, establish some frequently occurring patterns. The DF in the database are accompanied by glosses and assigned multiple tags, such as pragmatic function, additional semantics, the illocutionary type of the context, etc. As a starting point, Russian, Serbian and Slovene DF are included in the database. This data already shows substantial grammatical and lexical variability.

2016

Typology of Adjectives Benchmark for Compositional Distributional Models
Daria Ryzhova | Maria Kyuseva | Denis Paperno
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In this paper we present a novel application of compositional distributional semantic models (CDSMs): prediction of lexical typology. The paper introduces the notion of typological closeness, which is a novel rigorous formalization of semantic similarity based on comparison of multilingual data. Starting from the Moscow Database of Qualitative Features for adjective typology, we create four datasets of typological closeness, on which we test a range of distributional semantic models. We show that, on the one hand, vector representations of phrases based on data from one language can be used to predict how words within the phrase translate into different languages, and, on the other hand, that typological data can serve as a semantic benchmark for distributional models. We find that compositional distributional models, especially parametric ones, perform well above non-compositional alternatives on the task.