Marek Kubis

2026

A Benchmark for Audio Reasoning Capabilities of Multimodal Large Language Models
Iwona Christop | Mateusz Czyżnikiewicz | Paweł Skórzewski | Łukasz Bondaruk | Jakub Kubiak | Marcin Lewandowski | Marek Kubis
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

Existing benchmarks for the audio modality of multimodal large language models concentrate on testing individual audio tasks, such as speaker diarization or gender identification, in isolation. They cannot verify whether a multimodal model can answer questions that require reasoning skills to combine audio tasks of different categories. To address this issue, we propose Audio Reasoning Tasks (ART), a new benchmark for assessing the ability of multimodal models to solve problems that require reasoning over audio signals.

2025

Polish-English medical knowledge transfer: A new benchmark and results
Łukasz Grzybowski | Jakub Pokrywka | Michał Ciesiółka | Jeremi Ignacy Kaczmarek | Marek Kubis
Findings of the Association for Computational Linguistics: EMNLP 2025

Large Language Models (LLMs) have demonstrated significant potential in specialized tasks, including medical problem-solving. However, most studies predominantly focus on English-language contexts. This study introduces a novel benchmark dataset based on Polish medical licensing and specialization exams (LEK, LDEK, PES). The dataset, sourced from publicly available materials provided by the Medical Examination Center and the Chief Medical Chamber, includes Polish medical exam questions, along with a subset of parallel Polish-English corpora professionally translated for foreign candidates. By structuring a benchmark from these exam questions, we evaluate state-of-the-art LLMs, spanning general-purpose, domain-specific, and Polish-specific models, and compare their performance with that of human medical students and doctors. Our analysis shows that while models like GPT-4o achieve near-human performance, challenges persist in cross-lingual translation and domain-specific understanding. These findings highlight disparities in model performance across languages and medical specialties, emphasizing the limitations and ethical considerations of deploying LLMs in clinical practice.

Improving AI assistants embedded in short e-learning courses with limited textual content
Jacek Marciniak | Marek Kubis | Michał Gulczyński | Adam Szpilkowski | Adam Wieczarek | Marcin Szczepański
Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)

This paper presents a strategy for improving AI assistants embedded in short e-learning courses. The proposed method is implemented within a Retrieval-Augmented Generation (RAG) architecture and evaluated using several retrieval variants. The results show that query quality improves when the knowledge base is enriched with definitions of key concepts discussed in the course. Our main contribution is a lightweight enhancement approach that increases response quality without overloading the course with additional instructional content.

2024

This paper presents the POLygraph dataset, a unique resource for fake news detection in Polish. The dataset, created by an interdisciplinary team, is composed of two parts: the “fake-or-not” dataset with 11,360 pairs of news articles (identified by their URLs) and corresponding labels, and the “fake-they-say” dataset with 5,082 news articles (identified by their URLs) and tweets commenting on them. Unlike existing datasets, POLygraph encompasses a variety of approaches from source literature, providing a comprehensive resource for fake news detection. The data was collected through manual annotation by expert and non-expert annotators. The project also developed a software tool that uses advanced machine learning techniques to analyze the data and determine content authenticity. The tool and dataset are expected to benefit various entities, from public sector institutions to publishers and fact-checking organizations. Further dataset exploration will foster fake news detection and potentially stimulate the implementation of similar models in other languages. The paper focuses on the creation and composition of the dataset, so it does not include a detailed evaluation of the software tool for content authenticity analysis, which is planned at a later stage of the project.

Using Bibliodata LODification to Create Metadata-Enriched Literary Corpora in Line with FAIR Principles
Agnieszka Karlinska | Cezary Rosiński | Marek Kubis | Patryk Hubar | Jan Wieczorek
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

This paper discusses the design principles and procedures for creating a balanced corpus for research in computational literary studies, building on the experience of computational linguistics but adapting it to the specificities of the digital humanities. It showcases the development of the Metadata-enriched Polish Novel Corpus from the 19th and 20th centuries (19/20MetaPNC), consisting of 1,000 novels from 1854–1939, as an illustrative case and proposes a comprehensive workflow for the creation and reuse of literary corpora. What sets 19/20MetaPNC apart is its approach to balance, which considers the spatial dimension, the inclusion of non-canonical texts previously overlooked by other corpora, and the use of a complex, multi-stage metadata enrichment and verification process. Emphasis is placed on research-oriented metadata design, efficient data collection and data sharing according to the FAIR principles as well as 5- and 7-star data standards to increase the visibility and reusability of the corpus. A knowledge graph-based solution for the creation of exchangeable and machine-readable metadata describing corpora has been developed. For this purpose, metadata from bibliographic catalogs and other sources were transformed into Linked Data following the bibliodata LODification approach.

Two Approaches to Diachronic Normalization of Polish Texts
Kacper Dudzic | Filip Gralinski | Krzysztof Jassem | Marek Kubis | Piotr Wierzchon
Proceedings of the 8th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2024)

This paper discusses two approaches to the diachronic normalization of Polish texts: a rule-based solution that relies on a set of handcrafted patterns, and a neural normalization model based on the text-to-text transfer transformer architecture. The training and evaluation data prepared for the task are discussed in detail, along with experiments conducted to compare the proposed normalization solutions. A quantitative and qualitative analysis is made. It is shown that at the current stage of inquiry into the problem, the rule-based solution outperforms the neural one on 3 out of 4 variants of the prepared dataset, although in practice both approaches have distinct advantages and disadvantages.

2023

Back Transcription as a Method for Evaluating Robustness of Natural Language Understanding Models to Speech Recognition Errors
Marek Kubis | Paweł Skórzewski | Marcin Sowański | Tomasz Ziętkiewicz
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

In a spoken dialogue system, an NLU model is preceded by a speech recognition system that can deteriorate the performance of natural language understanding. This paper proposes a method for investigating the impact of speech recognition errors on the performance of natural language understanding models. The proposed method combines the back transcription procedure with a fine-grained technique for categorizing the errors that affect the performance of NLU models. The method relies on the usage of synthesized speech for NLU evaluation. We show that the use of synthesized speech in place of audio recording does not change the outcomes of the presented technique in a significant way.

2022

Towards a contextualised spatial-diachronic history of literature: mapping emotional representations of the city and the country in Polish fiction from 1864 to 1939
Agnieszka Karlińska | Cezary Rosiński | Jan Wieczorek | Patryk Hubar | Jan Kocoń | Marek Kubis | Stanisław Woźniak | Arkadiusz Margraf | Wiktor Walentynowicz
Proceedings of the 6th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

In this article, we discuss the conditions surrounding the building of historical and literary corpora. We describe the assumptions and method of making the original corpus of the Polish novel (1864-1939). Then, we present the research procedure aimed at demonstrating the variability of the emotional value of the concept of “the city” and “the country” in the texts included in our corpus. The proposed method considers the complex socio-political nature of Central and Eastern Europe, especially the fact that there was no unified Polish state during this period. The method can be easily replicated in studies of the literature of countries with similar specificities.

2020

Geometric Deep Learning Models for Linking Character Names in Novels
Marek Kubis
Proceedings of the 4th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

The paper investigates the impact of using geometric deep learning models on the performance of a character name linking system. Neural models that contain graph convolutional layers are compared with models that include conventional fully connected layers. The evaluation is performed with respect to the perfect name boundaries obtained from the test set and in a more demanding end-to-end setting where the character name linking system is preceded by a named entity recognizer.

2017

EUDAMU at SemEval-2017 Task 11: Action Ranking and Type Matching for End-User Development
Marek Kubis | Paweł Skórzewski | Tomasz Ziętkiewicz
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

The paper describes a system for end-user development using natural language. Our approach uses a ranking model to identify the actions to be executed followed by reference and parameter matching models to select parameter values that should be set for the given commands. We discuss the results of evaluation and possible improvements for future work.

2010

PolNet — Polish WordNet: Data and Tools
Zygmunt Vetulani | Marek Kubis | Tomasz Obrębski
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper presents the PolNet (Polish WordNet) project, which aims at building a linguistically oriented ontology for Polish compatible with other WordNet projects such as Princeton WordNet, EuroWordNet and other similarly organized ontologies. The main idea behind this kind of ontology is to use words related by synonymy to construct formal representations of concepts. In the paper we sketch the PolNet project methodology and implementation. We present data obtained so far, as well as the WQuery tool for querying and maintaining PolNet. WQuery is a query language that makes use of data types based on synsets, word senses and various semantic relations which occur in wordnet-like lexical databases. The tool is particularly useful for complex querying tasks such as searching for cycles in semantic relations, finding isolated synsets or computing overall statistics. Both the data and the tools presented in this paper have been applied within POLINT-112-SMS, an advanced AI system with emulated natural language competence, where they are used in the understanding subsystem.
