IGenBench: Benchmarking the Reliability of Text-to-Infographic Generation

Yinghao Tang; Xueding Liu; Boyuan Zhang; Tingfeng Lan; Yupeng Xie; Jiale Lao; Yiyao Wang; Haoxuan Li; Tingting Gao; Bo Pan; Luoxuan Weng; Xiuqi Huang; Minfeng Zhu; Yingchaojie Feng; Yuyu Luo; Wei Chen

Abstract

Infographics are composite visual artifacts that combine data visualizations with textual and illustrative elements to communicate information. While recent text-to-image (T2I) models can generate aesthetically appealing images, their reliability in generating infographics remains unclear. Generated infographics may appear correct at first glance but contain easily overlooked issues, such as distorted data encoding or incorrect textual content. We present IGenBench, the first benchmark for evaluating the reliability of text-to-infographic generation, comprising 600 curated test cases spanning 30 infographic types. We design an automated evaluation framework that decomposes reliability verification into atomic yes/no questions based on a taxonomy of 10 question types. We employ multimodal large language models (MLLMs) to verify each question, yielding question-level accuracy (Q-ACC) and infographic-level accuracy (I-ACC). We comprehensively evaluate 10 state-of-the-art T2I models on IGenBench. Our systematic analysis reveals key insights for future model development: (i) a three-tier performance hierarchy with the top model achieving Q-ACC of 0.90 but I-ACC of only 0.49; (ii) data-related dimensions emerging as universal bottlenecks (e.g., Data Completeness: 0.21); and (iii) the challenge of achieving end-to-end correctness across all models.
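
The gap between the two metrics reported above can be illustrated with a small sketch. It assumes, since the abstract does not spell out the formulas, that Q-ACC is the fraction of all yes/no questions answered "yes" and that I-ACC is the fraction of infographics for which every question passes. The function names, the case identifiers, and the sample verdicts are hypothetical and are not taken from the benchmark.

```python
from typing import Dict, List


def q_acc(verdicts: Dict[str, List[bool]]) -> float:
    """Question-level accuracy: share of all questions that passed (assumed definition)."""
    total = sum(len(answers) for answers in verdicts.values())
    passed = sum(sum(answers) for answers in verdicts.values())
    return passed / total if total else 0.0


def i_acc(verdicts: Dict[str, List[bool]]) -> float:
    """Infographic-level accuracy: share of infographics with every question passed
    (assumed definition). Infographics with no questions are skipped."""
    scored = [answers for answers in verdicts.values() if answers]
    if not scored:
        return 0.0
    return sum(all(answers) for answers in scored) / len(scored)


# Hypothetical verdicts: one list of yes/no outcomes per generated infographic.
example = {
    "case_a": [True, True, True, True],
    "case_b": [True, True, True, False],
    "case_c": [True, False, True, True],
}

print(f"Q-ACC = {q_acc(example):.2f}")
print(f"I-ACC = {i_acc(example):.2f}")
```

Under these assumed definitions, a single failed question is enough to mark an entire infographic as incorrect, which is one way a model can score high on Q-ACC while scoring much lower on I-ACC.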

Anthology ID: 2026.acl-long.1713
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 36919–36954
URL: https://aclanthology.org/2026.acl-long.1713/
Cite (ACL): Yinghao Tang, Xueding Liu, Boyuan Zhang, Tingfeng Lan, Yupeng Xie, Jiale Lao, Yiyao Wang, Haoxuan Li, Tingting Gao, Bo Pan, Luoxuan Weng, Xiuqi Huang, Minfeng Zhu, Yingchaojie Feng, Yuyu Luo, and Wei Chen. 2026. IGenBench: Benchmarking the Reliability of Text-to-Infographic Generation. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 36919–36954, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): IGenBench: Benchmarking the Reliability of Text-to-Infographic Generation (Tang et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.1713.pdf
Checklist: 2026.acl-long.1713.checklist.pdf
