Worldwide LiveVQA: Real-Time Visual Knowledge Seeking and Updating Across Languages

Xuanao Huang; Xingjia Liu; Zetong Zhou; Yuyang Peng; Yao Wan; Dongping Chen

doi:10.18653/v1/2026.findings-acl.1984

Abstract

Knowledge about the visual world not only evolves constantly but also emerges all over the world: breaking news in Tokyo, political events in São Paulo, and cultural phenomena in Cairo are first reported in Japanese, Portuguese, and Arabic, carrying regional context that English-centric resources cannot fully capture. Yet existing resources for visual knowledge remain confined to English, creating a "Worldwide Knowledge Gap" that hinders the development of truly global assistants. To quantify this gap, we introduce LiveVQA-W(orldwide), the first dynamically updated dataset for real-time, multilingual visual knowledge seeking and updating across ten major languages. Drawing from worldwide news outlets, YouTube videos, and academic platforms during August–December 2025, LiveVQA-W comprises 234K images, 873K questions, and 171K visual entities, evaluated at two levels: Level 1 for visual entity recognition and Level 2 for multi-hop cross-lingual reasoning. Our comprehensive benchmarking of 15 state-of-the-art MLLMs reveals that models without search achieve near-random performance, while search-augmented models exhibit severe linguistic bias, with English accuracy nearly double that of other languages. Furthermore, we explore visual knowledge updating through large-scale training, finding that injected knowledge improves recall but remains fragile under prompt rephrasing and image perturbations such as rotation and flipping. We release the fully replicable data collection pipeline and raw dataset to support continuous community-driven expansion. The benchmark, code, and related resources are available at: https://worldwide-livevqa.github.io.

Anthology ID: 2026.findings-acl.1984
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 39819–39894
URL: https://aclanthology.org/2026.findings-acl.1984/
DOI: 10.18653/v1/2026.findings-acl.1984
Cite (ACL): Xuanao Huang, Xingjia Liu, Zetong Zhou, Yuyang Peng, Yao Wan, and Dongping Chen. 2026. Worldwide LiveVQA: Real-Time Visual Knowledge Seeking and Updating Across Languages. In Findings of the Association for Computational Linguistics: ACL 2026, pages 39819–39894, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Worldwide LiveVQA: Real-Time Visual Knowledge Seeking and Updating Across Languages (Huang et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.1984.pdf
Checklist: 2026.findings-acl.1984.checklist.pdf
