Title: VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs
Authors: Ruotong Liao, Max Erler, Huiyu Wang, Guangyao Zhai, Gengyuan Zhang, Yunpu Ma, Volker Tresp
Date: 2024-11
Venue: Findings of the Association for Computational Linguistics: EMNLP 2024
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Publisher: Association for Computational Linguistics
Location: Miami, Florida, USA
Type: conference publication
Pages: 6577-6602
Identifier: liao-etal-2024-videoinsta
DOI: 10.18653/v1/2024.findings-emnlp.384
URL: https://aclanthology.org/2024.findings-emnlp.384/