Benchmarking Korean Idiom Understanding: A Comparative Analysis of Local and Global Models

Xiaonan Wang, Seoyoon Park, Hansaem Kim


Abstract
Although an increasing number of multilingual LLMs (large language models) have begun to support Korean, there remains a notable lack of benchmark datasets specifically designed to evaluate their proficiency in Korean cultural and linguistic understanding. A major reason for this gap is that many available benchmarks in Korean are adapted from English originals via translation, which often fails to reflect the unique cultural context embedded in the Korean language. Even the few benchmark datasets based on native Korean data that involve cultural content typically focus on tasks such as bias or hate speech detection, where cultural knowledge serves merely as topical background rather than being integrated as a core component of semantic understanding. To address this gap, we introduce the Korean Idiom Matching Benchmark (KIM Bench), which consists of 1,175 instances. Idioms are culture-specific and often untranslatable, making them ideal for testing models’ cross-cultural semantic understanding. Using KIM Bench, We evaluate global and Korean native models. Our analysis show that larger and locally trained models better capture idiom semantics and cultural nuances, while chain-of-thought prompting may reduce accuracy. Models still struggle with deep semantic and contextual understanding. KIM Bench offers a compact tool for cross-cultural evaluation and insights into improving performance on culturally grounded tasks.
Anthology ID:
2025.ranlp-1.156
Volume:
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Month:
September
Year:
2025
Address:
Varna, Bulgaria
Editors:
Galia Angelova, Maria Kunilovskaya, Marie Escribe, Ruslan Mitkov
Venue:
RANLP
SIG:
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Note:
Pages:
1341–1351
Language:
URL:
https://aclanthology.org/2025.ranlp-1.156/
DOI:
Bibkey:
Cite (ACL):
Xiaonan Wang, Seoyoon Park, and Hansaem Kim. 2025. Benchmarking Korean Idiom Understanding: A Comparative Analysis of Local and Global Models. In Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era, pages 1341–1351, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Benchmarking Korean Idiom Understanding: A Comparative Analysis of Local and Global Models (Wang et al., RANLP 2025)
Copy Citation:
PDF:
https://aclanthology.org/2025.ranlp-1.156.pdf