@inproceedings{demirci-etal-2025-llms,
title = "Are {LLM}s Truly Graph-Savvy? A Comprehensive Evaluation of Graph Generation",
author = "Demirci, Ege and
Kerur, Rithwik and
Singh, Ambuj",
editor = "Zhao, Jin and
Wang, Mingyang and
Liu, Zhu",
booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.acl-srw.64/",
doi = "10.18653/v1/2025.acl-srw.64",
pages = "884--897",
ISBN = "979-8-89176-254-1",
abstract = "While large language models (LLMs) have demonstrated impressive capabilities across diverse tasks, their ability to generate valid graph structures remains underexplored. We evaluate fifteen state-of-the-art LLMs on five specialized graph generation tasks spanning delivery networks, social networks, quantum circuits, gene-disease networks, and transportation systems. We also test the LLMs using 3 different prompt types: direct, iterative feedback, and program-augmented. Models supported with explicit reasoning modules (o3-mini-high, o1, Claude 3.7 Sonnet, DeepSeek-R1) solve more than twice as many tasks as their general-purpose peers, independent of parameter count. Error forensics reveals two recurring failure modes: smaller parameter size Llama models often violate basic structural constraints, whereas Claude models respect topology but mismanage higher-order logical rules. Allowing models to refine their answers iteratively yields uneven gains, underscoring fundamental differences in error-correction capacity. This work demonstrates that graph competence stems from specialized training methodologies rather than scale, establishing a framework for developing truly graph-savvy language models. Results and verification scripts available at https://github.com/egedemirci/Are-LLMs-Truly-Graph-Savvy-A-Comprehensive-Evaluation-of-Graph-Generation."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="demirci-etal-2025-llms">
<titleInfo>
<title>Are LLMs Truly Graph-Savvy? A Comprehensive Evaluation of Graph Generation</title>
</titleInfo>
<name type="personal">
<namePart type="given">Ege</namePart>
<namePart type="family">Demirci</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Rithwik</namePart>
<namePart type="family">Kerur</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ambuj</namePart>
<namePart type="family">Singh</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-07</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)</title>
</titleInfo>
<name type="personal">
<namePart type="given">Jin</namePart>
<namePart type="family">Zhao</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Mingyang</namePart>
<namePart type="family">Wang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Zhu</namePart>
<namePart type="family">Liu</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Vienna, Austria</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-254-1</identifier>
</relatedItem>
<abstract>While large language models (LLMs) have demonstrated impressive capabilities across diverse tasks, their ability to generate valid graph structures remains underexplored. We evaluate fifteen state-of-the-art LLMs on five specialized graph generation tasks spanning delivery networks, social networks, quantum circuits, gene-disease networks, and transportation systems. We also test the LLMs using 3 different prompt types: direct, iterative feedback, and program-augmented. Models supported with explicit reasoning modules (o3-mini-high, o1, Claude 3.7 Sonnet, DeepSeek-R1) solve more than twice as many tasks as their general-purpose peers, independent of parameter count. Error forensics reveals two recurring failure modes: smaller parameter size Llama models often violate basic structural constraints, whereas Claude models respect topology but mismanage higher-order logical rules. Allowing models to refine their answers iteratively yields uneven gains, underscoring fundamental differences in error-correction capacity. This work demonstrates that graph competence stems from specialized training methodologies rather than scale, establishing a framework for developing truly graph-savvy language models. Results and verification scripts available at https://github.com/egedemirci/Are-LLMs-Truly-Graph-Savvy-A-Comprehensive-Evaluation-of-Graph-Generation.</abstract>
<identifier type="citekey">demirci-etal-2025-llms</identifier>
<identifier type="doi">10.18653/v1/2025.acl-srw.64</identifier>
<location>
<url>https://aclanthology.org/2025.acl-srw.64/</url>
</location>
<part>
<date>2025-07</date>
<extent unit="page">
<start>884</start>
<end>897</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Are LLMs Truly Graph-Savvy? A Comprehensive Evaluation of Graph Generation
%A Demirci, Ege
%A Kerur, Rithwik
%A Singh, Ambuj
%Y Zhao, Jin
%Y Wang, Mingyang
%Y Liu, Zhu
%S Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
%D 2025
%8 July
%I Association for Computational Linguistics
%C Vienna, Austria
%@ 979-8-89176-254-1
%F demirci-etal-2025-llms
%X While large language models (LLMs) have demonstrated impressive capabilities across diverse tasks, their ability to generate valid graph structures remains underexplored. We evaluate fifteen state-of-the-art LLMs on five specialized graph generation tasks spanning delivery networks, social networks, quantum circuits, gene-disease networks, and transportation systems. We also test the LLMs using 3 different prompt types: direct, iterative feedback, and program-augmented. Models supported with explicit reasoning modules (o3-mini-high, o1, Claude 3.7 Sonnet, DeepSeek-R1) solve more than twice as many tasks as their general-purpose peers, independent of parameter count. Error forensics reveals two recurring failure modes: smaller parameter size Llama models often violate basic structural constraints, whereas Claude models respect topology but mismanage higher-order logical rules. Allowing models to refine their answers iteratively yields uneven gains, underscoring fundamental differences in error-correction capacity. This work demonstrates that graph competence stems from specialized training methodologies rather than scale, establishing a framework for developing truly graph-savvy language models. Results and verification scripts available at https://github.com/egedemirci/Are-LLMs-Truly-Graph-Savvy-A-Comprehensive-Evaluation-of-Graph-Generation.
%R 10.18653/v1/2025.acl-srw.64
%U https://aclanthology.org/2025.acl-srw.64/
%U https://doi.org/10.18653/v1/2025.acl-srw.64
%P 884-897
Markdown (Informal)
[Are LLMs Truly Graph-Savvy? A Comprehensive Evaluation of Graph Generation](https://aclanthology.org/2025.acl-srw.64/) (Demirci et al., ACL 2025)
ACL:
Ege Demirci, Rithwik Kerur, and Ambuj Singh. 2025. [Are LLMs Truly Graph-Savvy? A Comprehensive Evaluation of Graph Generation](https://aclanthology.org/2025.acl-srw.64/). In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 884–897, Vienna, Austria. Association for Computational Linguistics.