Łukasz Kobyliński

2025

Proceedings of the PolEval 2025 Workshop
Łukasz Kobyliński | Alina Wróblewska | Maciej Ogrodniczuk
Proceedings of the PolEval 2025 Workshop

PolEval 2025
Łukasz Kobyliński | Ryszard Staruch | Alina Wróblewska | Maciej Ogrodniczuk
Proceedings of the PolEval 2025 Workshop

PolEval is an annual shared-task evaluation campaign dedicated to advancing natural language processing for the Polish language. This paper presents an overview of PolEval 2025, the eighth edition of the campaign, which included three completed tasks covering machine-generated text detection, gender-inclusive language generation, and speech emotion recognition. The evaluation was conducted using standardised datasets and metrics via the AmuEval platform. PolEval 2025 attracted 15 teams and over 100 submissions, demonstrating continued engagement from the Polish NLP community. We describe the organisation of the campaign, the evaluation setup, and the role of PolEval in fostering reproducible research and community-driven benchmarking.

2019

Deep Learning in Event Detection in Polish
Łukasz Kobyliński | Michał Wasiluk
Proceedings of the 10th Global Wordnet Conference

Event detection is an important NLP task that has only recently been tackled in the context of Polish, mostly due to a lack of language resources. The available annotated corpora are still relatively small, and supervised learning approaches are limited by the size of training datasets. Event detection tools are very much needed, as they can be used to annotate more language resources automatically and to improve the accuracy of other NLP tasks that rely on the detection of events, such as question answering or machine translation. In this paper we present a deep-learning-based approach to this task, which proved to capture the knowledge contained in the training data most effectively and to outperform previously proposed methods. We show a direct comparison to previously published results, using the same data and experimental setup.

2014

PoliTa: A multitagger for Polish
Łukasz Kobyliński
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Part-of-Speech (POS) tagging is a crucial task in Natural Language Processing (NLP). POS tags may be assigned to tokens in text manually, by trained linguists, or using algorithmic approaches. In particular, for annotated text corpora, the quantity of textual data makes it infeasible to rely on manual tagging, so automated methods are used extensively. The quality of such methods is of critical importance, as even a 1% tagger error rate introduces about ten million errors in a corpus of a billion tokens. For Polish, several POS taggers have been proposed to date, but even the best of them achieves an accuracy of ca. 93%, as measured on the one-million-word subcorpus of the National Corpus of Polish (NCP). Since tagging is a classification task, in this article we introduce a new POS tagger for Polish, based on the idea of combining several classifiers to produce higher-quality tagging results than any of the individual taggers achieves alone.
