OpenHuEval: Evaluating Large Language Model on Hungarian Specifics

Haote Yang; Xingjian Wei; Jiang Wu; Noémi Ligeti-Nagy; Jiaxing Sun; Yinfan Wang; Győző Zijian Yang; Junyuan Gao; Jingchao Wang; Bowen Jiang; Shasha Wang; Nanjun Yu; Zihao Zhang; Shixin Hong; Hongwei Liu; Wei Li; Songyang Zhang; Dahua Lin; Lijun Wu; Gábor Prószéky; Conghui He

doi:10.18653/v1/2025.findings-acl.390

Abstract

We introduce OpenHuEval, the first benchmark for LLMs focusing on the Hungarian language and specifics. OpenHuEval is constructed from a vast collection of Hungarian-specific materials drawn from multiple sources. In its construction, we incorporated the latest design principles for evaluating LLMs, such as using real user queries from the internet, emphasizing the assessment of LLMs’ generative capabilities, and employing LLM-as-judge to enhance the multidimensionality and accuracy of evaluations. OpenHuEval encompasses eight Hungarian-specific dimensions, featuring five tasks and 3953 questions. Consequently, OpenHuEval provides a comprehensive, in-depth, and scientifically accurate assessment of LLM performance in the context of the Hungarian language and its specifics. We evaluated current mainstream LLMs, including both traditional LLMs and recently developed Large Reasoning Models (LRMs). The results demonstrate a significant need for evaluation and model optimization tailored to the Hungarian language and specifics. We also established a framework for analyzing the thinking processes of LRMs with OpenHuEval, revealing intrinsic patterns and mechanisms of these models in non-English languages, with Hungarian serving as a representative example. We will release OpenHuEval at https://github.com/opendatalab/OpenHuEval.

Anthology ID: 2025.findings-acl.390
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 7464–7520
URL: https://aclanthology.org/2025.findings-acl.390/
DOI: 10.18653/v1/2025.findings-acl.390
Cite (ACL): Haote Yang, Xingjian Wei, Jiang Wu, Noémi Ligeti-Nagy, Jiaxing Sun, Yinfan Wang, Győző Zijian Yang, Junyuan Gao, Jingchao Wang, Bowen Jiang, Shasha Wang, Nanjun Yu, Zihao Zhang, Shixin Hong, Hongwei Liu, Wei Li, Songyang Zhang, Dahua Lin, Lijun Wu, Gábor Prószéky, and Conghui He. 2025. OpenHuEval: Evaluating Large Language Model on Hungarian Specifics. In Findings of the Association for Computational Linguistics: ACL 2025, pages 7464–7520, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): OpenHuEval: Evaluating Large Language Model on Hungarian Specifics (Yang et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-acl.390.pdf
