StatsChartMWP: A Dataset for Evaluating Multimodal Mathematical Reasoning Abilities on Math Word Problems with Statistical Charts

Dan Zhu, Tianqiao Liu, Zitao Liu


Abstract
Recent advances in Large Multimodal Models (LMMs) have demonstrated impressive capabilities on mathematical reasoning tasks in visual contexts. As a step toward developing AI models that can conduct rigorous multi-step multimodal reasoning, we introduce StatsChartMWP, a real-world educational dataset for evaluating visual mathematical reasoning abilities on math word problems (MWPs) with statistical charts. Our dataset contains 8,514 chart-based MWPs, meticulously curated by K-12 educators within real-world teaching scenarios. We provide detailed preprocessing steps and manual annotations to help evaluate state-of-the-art models on StatsChartMWP. Comparing baselines, we find that current models struggle to carry out careful multi-step mathematical reasoning across technical language, diagrams, tables, and equations. To narrow this gap, we introduce CoTAR, a chain-of-thought (CoT) augmented reasoning solution that fine-tunes LMMs with solution-oriented, CoT-like reasoning steps. The LMM trained with CoTAR is more effective than current open-source approaches. We conclude by shedding light on the challenges and opportunities in improving LMMs, with the aim of steering future research and development efforts in statistical chart comprehension and analysis. The code and data are available at https://github.com/ai4ed/StatsChartMWP.
Anthology ID:
2025.findings-emnlp.695
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
12944–12954
URL:
https://aclanthology.org/2025.findings-emnlp.695/
Cite (ACL):
Dan Zhu, Tianqiao Liu, and Zitao Liu. 2025. StatsChartMWP: A Dataset for Evaluating Multimodal Mathematical Reasoning Abilities on Math Word Problems with Statistical Charts. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 12944–12954, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
StatsChartMWP: A Dataset for Evaluating Multimodal Mathematical Reasoning Abilities on Math Word Problems with Statistical Charts (Zhu et al., Findings 2025)
PDF:
https://aclanthology.org/2025.findings-emnlp.695.pdf
Checklist:
 2025.findings-emnlp.695.checklist.pdf