A Graph Talks, But Who’s Listening? Rethinking Evaluations for Graph-Language Models

Soham Petkar; Hari Aakash K; Anirudh Vempati; Akshit Sinha; Ponnurangam Kumaraguru; Chirag Agarwal

doi:10.18653/v1/2026.findings-acl.1624

Abstract

Recent research has extensively explored the graph-reasoning capabilities of Large Language Models (LLMs) through textual descriptions. However, benchmarks specifically designed for Graph-Language Models (GLMs), which integrate Graph Neural Networks (GNNs) with LLMs, remain significantly underdeveloped. In this work, we first demonstrate that existing GLM evaluations, largely repurposed from unimodal node- and edge-level tasks, fail to assess true multimodal integration. Our analysis reveals that strong performance on these benchmarks is achievable using textual or structural features in isolation, bypassing the need for joint reasoning. To bridge this gap, we introduce CLEGR (Compositional Language-Graph Reasoning), a benchmark explicitly designed to evaluate multimodal reasoning over graph topology and textual semantics. Evaluating representative GLMs on CLEGR shows that their performance degrades significantly and that unimodal soft-prompted LLMs perform on par with complex multimodal GLMs. Together, these findings highlight limitations in the graph reasoning capabilities of existing GLMs and provide a foundation for advancing the community toward explicit multimodal reasoning involving graph structure and language.

Anthology ID: 2026.findings-acl.1624
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 32441–32462
URL: https://aclanthology.org/2026.findings-acl.1624/
DOI: 10.18653/v1/2026.findings-acl.1624
Cite (ACL): Soham Petkar, Hari Aakash K, Anirudh Vempati, Akshit Sinha, Ponnurangam Kumaraguru, and Chirag Agarwal. 2026. A Graph Talks, But Who’s Listening? Rethinking Evaluations for Graph-Language Models. In Findings of the Association for Computational Linguistics: ACL 2026, pages 32441–32462, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): A Graph Talks, But Who’s Listening? Rethinking Evaluations for Graph-Language Models (Petkar et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.1624.pdf
Checklist: 2026.findings-acl.1624.checklist.pdf
