Artur Kiulian

2025

In this paper, we propose a model-agnostic cost-effective approach to developing bilingual base large language models (LLMs) to support English and any target language. The method includes vocabulary expansion, initialization of new embeddings, model training and evaluation. We performed our experiments with three languages, each using a non-Latin script—Ukrainian, Arabic, and Georgian.Our approach demonstrates improved language performance while reducing computational costs. It mitigates the disproportionate penalization of underrepresented languages, promoting fairness and minimizing adverse phenomena such as code-switching and broken grammar. Additionally, we introduce new metrics to evaluate language quality, revealing that vocabulary size significantly impacts the quality of generated text.

pdf bib

Towards Multilingual LLM Evaluation for Baltic and Nordic languages: A study on Lithuanian History
Yevhen Kostiuk | Oxana Vitman | Łukasz Gągała | Artur Kiulian
Proceedings of the 1st Workshop on Nordic-Baltic Responsible Evaluation and Alignment of Language Models (NB-REAL 2025)

pdf bib abs

Benchmarking Multimodal Models for Ukrainian Language Understanding Across Academic and Cultural Domains
Yurii Paniv | Artur Kiulian | Dmytro Chaplynskyi | Mykola Khandoga | Anton Polishko | Tetiana Bas | Guillermo Gabrielli
Proceedings of the Fourth Ukrainian Natural Language Processing Workshop (UNLP 2025)

While the evaluation of multimodal English-centric models is an active area of research with numerous benchmarks, there is a profound lack of benchmarks or evaluation suites for low- and mid-resource languages. We introduce ZNO-Vision, a comprehensive multimodal Ukrainian-centric benchmark derived from the standardized university entrance examination (ZNO). The benchmark consists of over 4300 expert-crafted questions spanning 12 academic disciplines, including mathematics, physics, chemistry, and humanities. We evaluated the performance of both open-source models and API providers, finding that only a handful of models performed above baseline. Alongside the new benchmark, we performed the first evaluation study of multimodal text generation for the Ukrainian language: we measured caption generation quality on the Multi30K-UK dataset. Lastly, we tested a few models from a cultural perspective on knowledge of national cuisine. We believe our work will advance multimodal generation capabilities for the Ukrainian language and our approach could be useful for other low-resource languages.

pdf bib

Evaluating LLM Judgment on Latvian and Lithuanian Short Answer Matching
Yevhen Kostiuk | Oxana Vitman | Łukasz Gągała | Artur Kiulian
Proceedings of the 1st Workshop on Nordic-Baltic Responsible Evaluation and Alignment of Language Models (NB-REAL 2025)

pdf bib abs

Framing the Language: Fine-Tuning Gemma 3 for Manipulation Detection
Mykola Khandoga | Yevhen Kostiuk | Anton Polishko | Kostiantyn Kozlov | Yurii Filipchuk | Artur Kiulian
Proceedings of the Fourth Ukrainian Natural Language Processing Workshop (UNLP 2025)

In this paper, we present our solutions for the two UNLP 2025 shared tasks: manipulation span detection and manipulation technique classification in Ukraine-related media content sourced from Telegram channels. We experimented with fine-tuning large language models (LLMs) with up to 12 billion parameters, including both encoder- and decoder-based architectures. Our experiments identified Gemma 3 12b with a custom classification head as the best-performing model for both tasks. To address the limited size of the original training dataset, we generated 50k synthetic samples and marked up an additional 400k media entries containing manipulative content.

2024

pdf bib abs

From Bytes to Borsch: Fine-Tuning Gemma and Mistral for the Ukrainian Language Representation
Artur Kiulian | Anton Polishko | Mykola Khandoga | Oryna Chubych | Jack Connor | Raghav Ravishankar | Adarsh Shirawalmath
Proceedings of the Third Ukrainian Natural Language Processing Workshop (UNLP) @ LREC-COLING 2024

In the rapidly advancing field of AI and NLP, generative large language models (LLMs) stand at the forefront of innovation, showcasing unparalleled abilities in text understanding and generation. However, the limited representation of low-resource languages like Ukrainian poses a notable challenge, restricting the reach and relevance of this technology. Our paper addresses this by fine-tuning the open-source Gemma and Mistral LLMs with Ukrainian datasets, aiming to improve their linguistic proficiency and benchmarking them against other existing models capable of processing Ukrainian language. This endeavor not only aims to mitigate language bias in technology but also promotes inclusivity in the digital realm. Our transparent and reproducible approach encourages further NLP research and development. Additionally, we present the Ukrainian Knowledge and Instruction Dataset (UKID) to aid future efforts in language model fine-tuning. Our research not only advances the field of NLP but also highlights the importance of linguistic diversity in AI, which is crucial for cultural preservation, education, and expanding AI’s global utility. Ultimately, we advocate for a future where technology is inclusive, enabling AI to communicate effectively across all languages, especially those currently underrepresented.

Venues

Fix author

Artur Kiulian

2025

2024

Co-authors

Venues