Does It Run and Is That Enough? Revisiting Text-to-Chart Generation with a Multi-Agent Approach

James Ford, Anthony Rios


Abstract
Large language models can translate natural-language chart descriptions into runnable code, yet approximately 15% of the generated scripts still fail to execute, even after supervised fine-tuning and reinforcement learning. We investigate whether this persistent error rate stems from model limitations or from reliance on a single-prompt design. To explore this, we propose a lightweight multi-agent pipeline that separates drafting, execution, repair, and judgment, using only an off-the-shelf GPT-4o-mini model. On the Text2Chart31 benchmark, our system reduces execution errors to 4.5% within three repair iterations, outperforming the strongest fine-tuned baseline by nearly 5 percentage points while requiring significantly less compute. Similar performance is observed on the ChartX benchmark, with an error rate of 4.6%, demonstrating strong generalization. Under current benchmarks, execution success appears largely solved. However, manual review reveals that 6 out of 100 sampled charts contain hallucinations, and an LLM-based accessibility audit shows that only 33.3% (Text2Chart31) and 7.2% (ChartX) of generated charts satisfy basic colorblindness guidelines. These findings suggest that future work should shift focus from execution reliability toward improving chart aesthetics, semantic fidelity, and accessibility.
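As a rough illustration of the separation into drafting, execution, repair, and judgment described in the abstract, the control loop might look like the Python sketch below. The names `execute`, `run_pipeline`, and `PipelineResult`, and the `draft`, `repair`, and `judge` callables, are hypothetical stand-ins for the LLM-backed agents, not the authors' implementation. Only the cap of three repair iterations comes from the abstract. Running the judge only on scripts that execute is an assumption.

```python
import os
import subprocess
import sys
import tempfile
from typing import Callable, NamedTuple, Optional


class PipelineResult(NamedTuple):
    code: str               # final script (after any repairs)
    error: Optional[str]    # None if the final script executed cleanly
    repairs: int            # number of repair iterations actually used
    verdict: Optional[str]  # judge output, or None if the script never ran


def execute(code: str, timeout: float = 30.0) -> Optional[str]:
    """Run a script in a subprocess; return None on success, else error text."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    finally:
        os.unlink(path)
    if proc.returncode == 0:
        return None
    return proc.stderr or f"exit code {proc.returncode}"


def run_pipeline(
    description: str,
    draft: Callable[[str], str],             # hypothetical drafting agent
    repair: Callable[[str, str, str], str],  # hypothetical repair agent
    judge: Callable[[str, str], str],        # hypothetical judging agent
    max_repairs: int = 3,                    # three iterations, per the abstract
) -> PipelineResult:
    code = draft(description)
    error = execute(code)
    repairs = 0
    # Feed the execution error back to the repair agent until the script
    # runs or the repair budget is exhausted.
    while error is not None and repairs < max_repairs:
        code = repair(description, code, error)
        error = execute(code)
        repairs += 1
    # Assumption: the judge only sees scripts that executed successfully.
    verdict = judge(description, code) if error is None else None
    return PipelineResult(code, error, repairs, verdict)
```

In this sketch each agent is a plain callable, so the same loop can be exercised with stub functions before any model is wired in.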
Anthology ID:
2025.findings-emnlp.1371
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
25160–25173
URL:
https://aclanthology.org/2025.findings-emnlp.1371/
Cite (ACL):
James Ford and Anthony Rios. 2025. Does It Run and Is That Enough? Revisiting Text-to-Chart Generation with a Multi-Agent Approach. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 25160–25173, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Does It Run and Is That Enough? Revisiting Text-to-Chart Generation with a Multi-Agent Approach (Ford & Rios, Findings 2025)
PDF:
https://aclanthology.org/2025.findings-emnlp.1371.pdf
Checklist:
 2025.findings-emnlp.1371.checklist.pdf