Meikang Qiu
2026
EgoMemory: Memory-Augmented Personalized Retrieval for Long-Context Egocentric Video
Yuanmin Tang | Jue Zhang | Xiaoting Qin | Jing Yu | Meikang Qiu | Gaopeng Gou | Gang Xiong | Qingwei Lin | Saravan Rajmohan | Dongmei Zhang | Qi Wu
Findings of the Association for Computational Linguistics: ACL 2026
Recent advances in AI and wearable devices, such as augmented-reality glasses, have made it possible to augment human memory by retrieving personal experiences in response to natural language queries. However, existing egocentric video datasets fall short in supporting the personalization and long-context reasoning required for episodic memory retrieval. To address these limitations, we introduce EgoMemory, a benchmark derived from Ego4D, enriched with 165,795 user-specific object annotations over 245 videos from 45 participants, yielding 639 distinct, human-curated, and evaluated queries for rich and individualized episodic memory retrieval. Leveraging this resource, we present EgoRetriever, a novel, training-free retrieval framework that combines Multimodal Large Language Models with reflective Chain-of-Thought prompting. Our approach enables interpretive inference of user intent and generates detailed target video descriptions by leveraging contextualized personal memory for video retrieval. Extensive experiments on three benchmarks, including EgoMemory, EgoCVR, and EgoLife, demonstrate that EgoRetriever consistently and substantially outperforms state-of-the-art baselines, highlighting its strong generalizability and practical potential for personalized, long-context egocentric video retrieval.
MultiFinBen: Benchmarking Large Language Models for Multilingual and Multimodal Financial Application
Xueqing Peng | Lingfei Qian | Yan Wang | Ruoyu Xiang | Yueru He | Yang Ren | Mingyang Jiang | Vincent Jim Zhang | Yuqing Guo | Jeff Zhao | Huan He | Yi Han | Yun Feng | Yuechen Jiang | Yupeng Cao | Haohang Li | Yangyang Yu | Xiaoyu Wang | Penglei Gao | Shengyuan Lin | Keyi Wang | Shanshan Yang | Yilun Zhao | Zhiwei Liu | Peng Lu | Jerry Huang | Suyuchen Wang | Triantafillos Papadopoulos | Polydoros Giannouris | Efstathia Soufleri | Nuo Chen | Zhiyang Deng | Heming Fu | Yijia Zhao | Mingquan Lin | Meikang Qiu | Kaleb E Smith | Arman Cohan | Xiao-Yang Liu | Jimin Huang | Guojun Xiong | Alejandro Lopez-Lira | Xi Chen | Junichi Tsujii | Jian-Yun Nie | Sophia Ananiadou | Qianqian Xie
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Real-world financial analysis involves information across multiple languages and modalities, from reports and news to scanned filings and meeting recordings. Yet most existing evaluations of LLMs in finance remain text-only, monolingual, and largely saturated by current models. To bridge these gaps, we present MultiFinBen, the first expert-annotated multilingual (five languages) and multimodal (text, vision, audio) benchmark for evaluating LLMs in realistic financial contexts. MultiFinBen introduces two new task families: multilingual financial reasoning, which tests cross-lingual evidence integration from filings and news, and financial OCR, which extracts structured text from scanned documents containing tables and charts. Rather than aggregating all available datasets, we apply a structured, difficulty-aware selection based on advanced model performance, ensuring balanced challenge and removing redundant tasks. Evaluating 21 leading LLMs shows that even frontier multimodal models like GPT-4o achieve only 46.01% overall, stronger on vision and audio but dropping sharply in multilingual settings. These findings expose persistent limitations in multilingual, multimodal, and expert-level financial reasoning. All datasets, evaluation scripts, and leaderboards are publicly released.
Co-authors
- Sophia Ananiadou 1
- Yupeng Cao 1
- Nuo Chen 1
- Xi Chen 1
- Arman Cohan 1
- Zhiyang Deng 1
- Yun Feng 1
- Heming Fu 1
- Penglei Gao 1
- Polydoros Giannouris 1
- Gaopeng Gou 1
- Yuqing Guo 1
- Yi Han 1
- Huan He 1
- Yueru He 1
- Jerry Huang 1
- Jimin Huang 1
- Mingyang Jiang 1
- Yuechen Jiang 1
- Haohang Li 1
- Mingquan Lin 1
- Qingwei Lin 1
- Shengyuan Lin 1
- Xiao-Yang Liu 1
- Zhiwei Liu 1
- Alejandro Lopez-Lira 1
- Peng Lu 1
- Jian-Yun Nie 1
- Triantafillos Papadopoulos 1
- Xueqing Peng 1
- Lingfei Qian 1
- Xiaoting Qin 1
- Saravan Rajmohan 1
- Yang Ren 1
- Kaleb E. Smith 1
- Efstathia Soufleri 1
- Yuanmin Tang 1
- Jun’ichi Tsujii 1
- Keyi Wang 1
- Suyuchen Wang 1
- Xiaoyu Wang 1
- Yan Wang 1
- Qi Wu 1
- Ruoyu Xiang 1
- Qianqian Xie 1
- Gang Xiong 1
- Guojun Xiong 1
- Shanshan Yang 1
- Jing Yu 1
- Yangyang Yu 1
- Dongmei Zhang 1
- Jue Zhang 1
- Vincent Jim Zhang 1
- Jeff Zhao 1
- Yijia Zhao 1
- Yilun Zhao 1