GeoEval: Benchmark for Evaluating LLMs and Multi-Modal Models on Geometry Problem-Solving

Authors: Jiaxin Zhang, Zhong-Zhi Li, Ming-Liang Zhang, Fei Yin, Cheng-Lin Liu, Yashar Moshfeghi
Published in: Findings of the Association for Computational Linguistics: ACL 2024
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics
Location: Bangkok, Thailand
Date: 2024-08
Type: conference publication
Pages: 1258-1276
Citation key: zhang-etal-2024-geoeval
DOI: 10.18653/v1/2024.findings-acl.73
URL: https://aclanthology.org/2024.findings-acl.73/