AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models

Authors: Wanjun Zhong, Ruixiang Cui, Yiduo Guo, Yaobo Liang, Shuai Lu, Yanlin Wang, Amin Saied, Weizhu Chen, Nan Duan
Published: 2024-06
Venue: Findings of the Association for Computational Linguistics: NAACL 2024
Editors: Kevin Duh, Helena Gomez, Steven Bethard
Publisher: Association for Computational Linguistics
Location: Mexico City, Mexico
Type: conference publication
Pages: 2299-2314
Identifier: zhong-etal-2024-agieval
DOI: 10.18653/v1/2024.findings-naacl.149
URL: https://aclanthology.org/2024.findings-naacl.149/