Milos Pacak


Slavic languages—comparative morphosyntactic research
Proceedings of the Annual meeting of the Association for Machine Translation and Computational Linguistics

An appropriate goal for present-day linguistics is the development of a general theory of relations between languages. One necessary requirement in the development of such a theory is the identification and classification of inflected forms in terms of their morphosyntactic properties in a set of presumably related languages. According to Sapir, “all languages differ from one another, but certain ones differ far more than others”. Of the Slavic languages he might well have said that they are all alike, but some are more alike than others. The similarities stemming from their common origin and from subsequent parallel development enable us to group them into a number of more or less homogeneous types. The experimental comparative research at Georgetown University focused on a group of four Slavic languages, namely Russian, Czech, Polish and Serbocroatian. The first step in the comparative procedure described here is the morphosyntactic analysis of each of the four languages individually. The analysis should be based on the complementary distribution of inflectional morphemes. The properties whose distribution must be determined are: 1) the graphemic shape of the inflectional morphemes, 2) the distributional classes and subclasses of stem morphemes, and, on the basis of 1 and 2, 3) the morphosyntactic function of the inflectional morphemes, which is determined by the distributional subclass of the stem morpheme. This dependence is written f(x,y)-l, where x is the distributional subclass of the stem morpheme (a constant) and y is the given inflectional morpheme (a free variable). On the basis of this preliminary analysis, the patterns of absolute equivalence, partial equivalence, and absolute difference can be established for each class of inflected forms in each language under study. Once this has been accomplished, the results can be used to determine the extent of distributional equivalences among the individual languages.
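The dependence of morphosyntactic function on the pair (stem subclass, inflectional morpheme) can be illustrated with a small lookup table. The sketch below is not the procedure used in the study. The subclass labels (`hard`, `hard_stressed`, `soft`), the reading labels, and the reading of the three patterns as identical, overlapping, and disjoint sets of functions are all assumptions made for illustration, and the few Russian adjectival endings shown are a simplified sample.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical table for f(x, y): x is the distributional subclass of the stem,
# y is the graphemic shape of the inflectional morpheme, and the value is the
# set of morphosyntactic functions the pair can express (simplified sample).
F: Dict[Tuple[str, str], FrozenSet[str]] = {
    ("hard", "-ый"): frozenset({"nom.sg.masc", "acc.sg.masc.inan"}),
    ("soft", "-ий"): frozenset({"nom.sg.masc", "acc.sg.masc.inan"}),
    ("hard_stressed", "-ой"): frozenset({
        "nom.sg.masc", "acc.sg.masc.inan",
        "gen.sg.fem", "dat.sg.fem", "ins.sg.fem", "loc.sg.fem",
    }),
    ("hard", "-ого"): frozenset({"gen.sg.masc", "gen.sg.neut", "acc.sg.masc.anim"}),
}


def f(x: str, y: str) -> FrozenSet[str]:
    """Return the functions of morpheme y for stem subclass x (KeyError if unknown)."""
    return F[(x, y)]


def compare(a: FrozenSet[str], b: FrozenSet[str]) -> str:
    """Classify two function sets under the assumed reading of the three patterns."""
    if a == b:
        return "absolute equivalence"
    if a & b:  # at least one shared function
        return "partial equivalence"
    return "absolute difference"


if __name__ == "__main__":
    print(compare(f("hard", "-ый"), f("soft", "-ий")))
    print(compare(f("hard", "-ый"), f("hard_stressed", "-ой")))
    print(compare(f("hard", "-ый"), f("hard", "-ого")))
```

Under these assumptions, the same comparison could be applied across languages once an analogous table has been built for each of them.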
The applicability of this procedure was tested on the class of adjectivals. For adjectivals, the following morphosyntactic properties were analyzed within each language: 1) the category of gender, 2) the category of animateness, 3) the categories of case and number. The product of this comparative analysis is a set of formation rules which embody a system for the identification of the inflected forms. The detailed results will be presented in an additional report.
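As a rough illustration of how rules of this kind might identify an inflected adjectival form, the sketch below splits a form into stem and ending and looks up the possible values of gender, animateness, case, and number. The stem lexicon `STEMS`, the rule table `RULES`, and the function `identify` are hypothetical and cover only a handful of Russian forms; the rule set produced by the study is not reproduced here.

```python
from typing import Dict, List, NamedTuple, Tuple


class Analysis(NamedTuple):
    stem: str
    ending: str
    gender: str
    animateness: str  # "any" where the category is not distinguished
    case: str
    number: str


# Hypothetical stem lexicon: stem -> distributional subclass.
STEMS: Dict[str, str] = {"нов": "hard", "син": "soft"}

# Hypothetical rule table: (subclass, ending) -> (gender, animateness, case, number).
RULES: Dict[Tuple[str, str], List[Tuple[str, str, str, str]]] = {
    ("hard", "ая"): [("fem", "any", "nom", "sg")],
    ("hard", "ую"): [("fem", "any", "acc", "sg")],
    ("hard", "ого"): [
        ("masc", "any", "gen", "sg"),
        ("neut", "any", "gen", "sg"),
        ("masc", "anim", "acc", "sg"),
    ],
    ("soft", "яя"): [("fem", "any", "nom", "sg")],
    ("soft", "юю"): [("fem", "any", "acc", "sg")],
}


def identify(form: str) -> List[Analysis]:
    """Try every stem/ending split and collect the analyses licensed by the rules."""
    results: List[Analysis] = []
    for cut in range(1, len(form)):
        stem, ending = form[:cut], form[cut:]
        subclass = STEMS.get(stem)
        if subclass is None:
            continue  # stem not in the lexicon
        for gender, animateness, case, number in RULES.get((subclass, ending), []):
            results.append(Analysis(stem, ending, gender, animateness, case, number))
    return results


if __name__ == "__main__":
    for word in ("новую", "нового", "синяя"):
        print(word, identify(word))
```

A form with several entries in the rule table, such as the one ending in "ого", yields several analyses; in this sketch the ambiguity is simply returned to the caller.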