MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark

Authors: Hongwei Liu, Zilong Zheng, Yuxuan Qiao, Haodong Duan, Zhiwei Fei, Fengzhe Zhou, Wenwei Zhang, Songyang Zhang, Dahua Lin, Kai Chen
Date: 2024-08
Venue: Findings of the Association for Computational Linguistics: ACL 2024
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics
Location: Bangkok, Thailand
Type: conference publication
Citation key: liu-etal-2024-mathbench
DOI: 10.18653/v1/2024.findings-acl.411
URL: https://aclanthology.org/2024.findings-acl.411/
Pages: 6884-6915