SimIdioms: A Corpus and Benchmark for Ukrainian Idiom Translation

Yaryna Petruniv, Iuliia Makogon, Roman Kyslyi


Abstract
We present a corpus of aligned Ukrainian–English idiomatic expressions and a comprehensive evaluation of six large language models on the task of translating sentences containing idioms. The corpus is constructed by linking entries across multiple phraseological dictionaries and the MIDAS corpus using vector similarity search, enriched with figurative meanings, contextual sentences from the UberText fiction corpus, and semantic transparency scores. We evaluate Gemini 2.5 Flash, Claude Haiku 4.5, Gemma 3 12B, Qwen3-30B-A3B, LapaLM, and Tiny Aya Global in both Ukrainian-to-English and English-to-Ukrainian directions under default and context-augmented prompting. Our evaluation of 65,723 translations reveals a pronounced direction asymmetry, with all models performing substantially worse when translating into Ukrainian. Providing figurative meaning and target idiom candidates improves quality for most models in Ukrainian-to-English but has limited effect in the reverse direction. We additionally show that semantic transparency of idioms is only weakly correlated with translation quality. We release the corpus and evaluation framework to support research on idiomatic translation for mid-resource languages.
Anthology ID:
2026.unlp-1.5
Volume:
Proceedings of the Fifth Ukrainian Natural Language Processing Conference (UNLP 2026)
Month:
May
Year:
2026
Address:
Lviv, Ukraine
Editor:
Mariana Romanyshyn
Venue:
UNLP
Publisher:
Association for Computational Linguistics
Pages:
41–52
URL:
https://aclanthology.org/2026.unlp-1.5/
Cite (ACL):
Yaryna Petruniv, Iuliia Makogon, and Roman Kyslyi. 2026. SimIdioms: A Corpus and Benchmark for Ukrainian Idiom Translation. In Proceedings of the Fifth Ukrainian Natural Language Processing Conference (UNLP 2026), pages 41–52, Lviv, Ukraine. Association for Computational Linguistics.
Cite (Informal):
SimIdioms: A Corpus and Benchmark for Ukrainian Idiom Translation (Petruniv et al., UNLP 2026)
PDF:
https://aclanthology.org/2026.unlp-1.5.pdf