2017
Stylometric Analysis of Parliamentary Speeches: Gender Dimension
Justina Mandravickaitė | Tomas Krilavičius
Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing
The relation between gender and language has been studied by many authors; however, some uncertainty remains regarding the influence of gender on language usage in professional environments. Often, the studied data sets are too small, or the texts of individual authors are too short, to capture gender-related differences in language usage successfully. This study draws on a larger corpus of speech transcripts from the Lithuanian Parliament (1990-2013) to explore gender-related language differences in political debates via stylometric analysis. The experimental setup consists of stylistic features that indicate lexical style and do not require external linguistic tools, namely the most frequent words, in combination with unsupervised machine learning algorithms. The results show that gender differences in language use persist in the professional environment, not only in the usage of function words and preferred linguistic constructions but also in the topics presented.
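As an illustration of the "most frequent words" feature mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes whitespace tokenization and lowercasing, and the function name `mfw_features` and the toy input strings are hypothetical. Each text is represented by the relative frequencies of the corpus-wide most frequent words; such vectors could then be passed to an unsupervised method such as clustering.

```python
from collections import Counter


def mfw_features(texts, n_words=3):
    """Return (vocabulary, vectors).

    vocabulary: the n_words most frequent words over all texts.
    vectors: for each text, the relative frequency of each vocabulary word.
    Tokenization (lowercase + whitespace split) is an assumption of this sketch.
    """
    tokenized = [text.lower().split() for text in texts]
    corpus_counts = Counter(tok for toks in tokenized for tok in toks)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(corpus_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    vocabulary = [word for word, _ in ranked[:n_words]]
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty texts
        vectors.append([counts[word] / total for word in vocabulary])
    return vocabulary, vectors


# Toy placeholder "speeches"; real input would be full transcripts.
vocab, vecs = mfw_features(["a b a c", "b b a d"], n_words=2)
```

Because the features are plain word counts, no tagger or parser is needed, which matches the abstract's point about not requiring external linguistic tools.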
Identification of Multiword Expressions for Latvian and Lithuanian: Hybrid Approach
Justina Mandravickaitė | Tomas Krilavičius
Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)
We discuss an experiment on the automatic identification of bigram multiword expressions in parallel Latvian and Lithuanian corpora. Raw corpora, lexical association measures (LAMs) and supervised machine learning (ML) are used because of the scarcity and limited quality of lexical resources and tools (e.g., POS taggers, parsers). Combining LAMs with ML has proven rather effective for other languages, and it yields good results for Lithuanian and Latvian as well. By combining LAMs with ML, we achieved 92.4% precision and 52.2% recall for Latvian and 95.1% precision and 77.8% recall for Lithuanian.
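The abstract does not say which lexical association measures were used. As a hedged illustration only, the sketch below computes pointwise mutual information (PMI), one commonly used association measure for bigrams, from a raw token list. The function name `bigram_pmi` and the toy tokens are hypothetical and not taken from the paper; in a hybrid setup, scores like these would serve as input features for a supervised classifier.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Return {(w1, w2): PMI in bits} for every adjacent token pair.

    PMI(w1, w2) = log2( P(w1, w2) / (P(w1) * P(w2)) ), with probabilities
    estimated by simple relative frequencies (an assumption of this sketch).
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_unigrams
        p_w2 = unigrams[w2] / n_unigrams
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy placeholder tokens; real input would be a tokenized raw corpus.
scores = bigram_pmi(["new", "york", "is", "new", "york"])
```

A positive score means the pair co-occurs more often than expected if the two words were independent. On small corpora PMI is known to overrate rare pairs, which is one reason association scores are often combined with a learned classifier rather than thresholded directly.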
2016
NLP Infrastructure for the Lithuanian Language
Daiva Vitkutė-Adžgauskienė | Andrius Utka | Darius Amilevičius | Tomas Krilavičius
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
The Information System for Syntactic and Semantic Analysis of the Lithuanian Language (Lithuanian: Lietuvių kalbos sintaksinės ir semantinės analizės informacinė sistema, LKSSAIS) is the first infrastructure for the Lithuanian language that combines Lithuanian language tools and resources for diverse linguistic research and application tasks. It provides access to basic as well as advanced natural language processing tools and resources, including tools for corpus creation and management, text preprocessing and annotation, ontology building, named entity recognition, morphosyntactic and semantic analysis, sentiment analysis, etc. It is an important platform for researchers and developers in the field of natural language technology.
2015
Automatic Thematic Classification of the Titles of the Seimas Votes
Vytautas Mickevičius | Tomas Krilavičius | Vaidas Morkevičius | Aušra Mackutė-Varoneckienė
Proceedings of the 20th Nordic Conference of Computational Linguistics (NODALIDA 2015)
Classification of Short Legal Lithuanian Texts
Vytautas Mickevičius | Tomas Krilavičius | Vaidas Morkevičius
The 5th Workshop on Balto-Slavic Natural Language Processing
2013
A Comparison of Approaches for Sentiment Classification on Lithuanian Internet Comments
Jurgita Kapočiūtė-Dzikienė | Algis Krupavičius | Tomas Krilavičius
Proceedings of the 4th Biennial International Workshop on Balto-Slavic Natural Language Processing