Weidong Zhou
2026
KnowDR-REC: Auditing Knowledge-Conditioned Visual Grounding in Referring Expression Comprehension
Guanghao Jin | Jingpei Wu | Tianpei Guo | Yiyi Niu | Weidong Zhou | Linyi Yang | Guoyang Liu
Findings of the Association for Computational Linguistics: ACL 2026
While Multimodal Large Language Models (MLLMs) have demonstrated the capacity for multimodal reasoning, current Referring Expression Comprehension (REC) benchmarks lag behind: they rely predominantly on intra-image cues and neglect external world knowledge, which significantly impedes the evolution of REC toward real-world applications. This limitation obscures a model’s true capability to conduct textual reasoning (entity resolution), resolve spatial location (visual grounding), and verify reference validity (hallucination rejection). To address this, we introduce KnowDR-REC, a targeted audit benchmark comprising 1,042 positive triplets derived from real-world knowledge, along with rigorously matched negative samples. Unlike traditional datasets, we implement a controllable counterfactual evaluation mechanism that subjects textual expressions to single-factor perturbations (entity, relation, or time) to test sensitivity to fine-grained factual changes. Extensive evaluation of 18 state-of-the-art MLLMs exposes a critical “binding hallucination,” revealing that current high performance is often built on fragile visual shortcuts rather than true understanding. KnowDR-REC thus serves as a pivotal diagnostic instrument, steering future research toward the genuine integration of perception and reasoning.
TiKMiX: Efficient Semi-Dynamic Data Mixture via Data Influence for LLM Pre-training
Yifan Wang | Binbinliu | Fengze Liu | Yuanfan Guo | Jiyao Deng | Xuecheng Wu | Weidong Zhou | Xiaohuan Zhou | Taifeng Wang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The data mixture used in the pre-training of a language model is a cornerstone of its final performance. Static data mixing strategies in Large Language Model (LLM) pre-training are often suboptimal as they fail to adapt to the model’s evolving learning states. Conversely, fully online dynamic updates, while adaptive, incur prohibitive computational costs. To bridge this gap, we propose TiKMiX, an efficient semi-dynamic data mixing framework. Our approach is grounded in a key observation of influence ranking invariance: the relative importance of data domains exhibits strong temporal stability over long training intervals. Leveraging this insight, we propose Group Influence, an efficient approach for quantifying domain impact, and formulate data mixing as a periodic, low-overhead influence maximization problem. Compared with REGMIX, the proposed method reduces computational overhead by 80% and achieves an average performance gain of 2% across nine downstream benchmarks, thereby effectively mitigating data under-digestion.