LVLMs and Humans Ground Differently in Referential Communication

Peter Zeng; Weiling Li; Amie J. Paige; Zhengxiang Wang; Panagiotis Kaliosis; Dimitris Samaras; Gregory J. Zelinsky; Susan E. Brennan; Owen Rambow

Abstract

For generative AI agents to partner effectively with human users, the ability to accurately predict human intent is critical. But this ability to collaborate remains limited by a critical deficit: an inability to model common ground. We present a referential communication experiment with a factorial design involving director-matcher pairs (human-human, human-AI, AI-human, and AI-AI) that interact over multiple turns in repeated rounds to match pictures of objects not associated with any obvious lexicalized labels. We show that LVLMs cannot interactively generate and resolve referring expressions in a way that enables smooth communication, a crucial skill that underlies human language use. We release our corpus of 356 dialogues (89 pairs over 4 rounds each) along with the online pipeline for data collection and the tools for analyzing accuracy, efficiency, and lexical overlap.

Anthology ID: 2026.acl-long.410
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 9061–9087
URL: https://aclanthology.org/2026.acl-long.410/
Cite (ACL): Peter Zeng, Weiling Li, Amie J. Paige, Zhengxiang Wang, Panagiotis Kaliosis, Dimitris Samaras, Gregory J. Zelinsky, Susan Brennan, and Owen Rambow. 2026. LVLMs and Humans Ground Differently in Referential Communication. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 9061–9087, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): LVLMs and Humans Ground Differently in Referential Communication (Zeng et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.410.pdf
Checklist: 2026.acl-long.410.checklist.pdf
