Dimitar Kazakov


Machine Learning Models of Universal Grammar Parameter Dependencies
Dimitar Kazakov | Guido Cordoni | Andrea Ceolin | Monica-Alexandrina Irimia | Shin-Sook Kim | Dimitris Michelioudakis | Nina Radkevich | Cristina Guardiano | Giuseppe Longobardi
Proceedings of the Workshop Knowledge Resources for the Socio-Economic Sciences and Humanities associated with RANLP 2017

The use of parameters in the description of natural language syntax has to balance the need to discriminate among (sometimes subtly different) languages, which can be seen as a cross-linguistic version of Chomsky’s (1964) descriptive adequacy, against the complexity of the acquisition task that a large number of parameters would imply, which is a problem for explanatory adequacy. Here we present a novel approach in which a machine learning algorithm is used to find dependencies in a table of parameters. The result is a dependency graph in which some of the parameters can be fully predicted from others. These empirical findings can then be subjected to linguistic analysis, which may refute them by providing typological counter-examples from languages not included in the original dataset, dismiss them on theoretical grounds, or uphold them as tentative empirical laws worthy of further study.

Building Dialectal Arabic Corpora
Hani Elgabou | Dimitar Kazakov
Proceedings of the Workshop Human-Informed Translation and Interpreting Technology

The aim of this research is to identify local Arabic dialects in texts from social media (Twitter) and link them to specific geographic areas. Dialect identification is studied as a subset of the task of language identification. The proposed method is based on unsupervised learning that uses lexical and geographic distance simultaneously. While this study focusses on Libyan dialects, the approach is general and could produce resources to support human translators and interpreters dealing with vernaculars rather than standard Arabic.


Using Parallel Corpora for Word Sense Disambiguation
Dimitar Kazakov | Ahmad R. Shahid
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013


Unsupervised Construction of a Multilingual WordNet from Parallel Corpora
Dimitar Kazakov | Ahmad R. Shahid
Proceedings of the Workshop on Natural Language Processing Methods and Corpora in Translation, Lexicography, and Language Learning


WordNet-based text document clustering
Julian Sedding | Dimitar Kazakov
Proceedings of the 3rd workshop on RObust Methods in Analysis of Natural Language Data (ROMAND 2004)