Title: A Video Is Worth 4096 Tokens: Verbalize Videos To Understand Them In Zero Shot
Authors: Aanisha Bhattacharyya, Yaman K Singla, Balaji Krishnamurthy, Rajiv Ratn Shah, Changyou Chen
Published in: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Publisher: Association for Computational Linguistics
Location: Singapore
Date: 2023-12
Type: conference publication
Pages: 9822-9839
Citation key: bhattacharyya-etal-2023-video
DOI: 10.18653/v1/2023.emnlp-main.608
URL: https://aclanthology.org/2023.emnlp-main.608/