Yiming Zhang
Other people with similar names: Yiming Zhang, Yiming Zhang, Yiming Zhang (CMU)
Unverified author pages with similar names: Yiming Zhang
2026
Fine-Grained Data Ordering Improves Fine-Tuning for Large Language Models
Xiaomeng Hu | Yixuan Tang | Haoze Li | Hao Chen | Qi Zhang | Zhanming Shen | Yiming Zhang | Haobo Wang | Junbo Zhao
Findings of the Association for Computational Linguistics: ACL 2026
With the rapid progress of large language models (LLMs), aligning a general-purpose model with downstream tasks through fine-tuning has become a central research focus. Selecting only high-quality examples for training has been shown to be one of the most effective ways to improve fine-tuning performance. However, prior work concentrates almost exclusively on data preprocessing, that is, filtering and cleaning data before training begins, while the order and composition of data during training have received little fine-grained attention. To fill this gap, we propose Fine-Grained Order Fine-Tuning (FOT), a method that schedules the order of training data at a fine granularity within each epoch. Drawing on curriculum-learning principles, FOT defines data difficulty based on the relevance between the data and the model, and then dynamically schedules the training order in each epoch according to that difficulty. In both large-scale continued pre-training and small-scale supervised fine-tuning experiments, FOT achieves an average improvement of 2.4% over baselines. Our study offers a new perspective on data governance in the fine-tuning phase.
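The per-epoch reordering idea in the abstract above can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how relevance between data and model is measured, so the sketch assumes difficulty is the current model's squared error on an example and assumes an easy-to-hard order. The one-parameter model and the names `difficulty`, `schedule_epoch`, and `train` are hypothetical.

```python
from typing import List, Tuple

Example = Tuple[float, float]  # (input x, target y)


def difficulty(w: float, example: Example) -> float:
    """Assumed difficulty score: squared error of the current model w*x on the example."""
    x, y = example
    return (w * x - y) ** 2


def schedule_epoch(w: float, data: List[Example]) -> List[Example]:
    """Re-score every example with the current model and sort easiest first (assumption)."""
    return sorted(data, key=lambda ex: difficulty(w, ex))


def train(data: List[Example], epochs: int = 3, lr: float = 0.01):
    """Fit y = w*x by SGD, recomputing the data order at the start of every epoch."""
    w = 0.0
    orders = []  # the order used in each epoch, kept for inspection
    for _ in range(epochs):
        ordered = schedule_epoch(w, data)
        orders.append(ordered)
        for x, y in ordered:
            grad = 2 * (w * x - y) * x  # gradient of the squared error w.r.t. w
            w -= lr * grad
    return w, orders


if __name__ == "__main__":
    toy_data = [(3.0, 6.0), (1.0, 2.0), (2.0, 4.0)]
    w, orders = train(toy_data, epochs=50)
    print(w, orders[0])
```

Because the scores are recomputed from the current weights each epoch, the order can change as the model improves, which is the sense in which the schedule is dynamic rather than fixed before training.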
Can LLMs Act as Historians? Evaluating Historical Research Capabilities of LLMs via the Chinese Imperial Examination
Lirong Gao | Zeqing Wang | Yuyan Cai | Jiayi Deng | Yanmei Gu | Yiming Zhang | Jia Zhou | Yanfei Zhang | Junbo Zhao
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
While Large Language Models (LLMs) have increasingly assisted in historical tasks such as text processing, their capacity for professional-level historical reasoning remains underexplored. Existing benchmarks primarily assess basic knowledge breadth or lexical understanding, failing to capture the higher-order skills—such as evidentiary reasoning—that are central to historical research. To fill this gap, we introduce ProHist-Bench, a novel benchmark anchored in the Chinese Imperial Examination (Keju) system—a comprehensive microcosm of East Asian political, social, and intellectual history spanning over 1,300 years. Developed through deep interdisciplinary collaboration, ProHist-Bench features 400 challenging, expert-curated questions across eight dynasties, accompanied by 10,891 fine-grained evaluation rubrics. Through a rigorous evaluation of 18 LLMs, we reveal a significant proficiency gap: even state-of-the-art LLMs struggle with complex historical research questions. We hope ProHist-Bench will facilitate the development of domain-specific reasoning LLMs, advance computational historical research, and further uncover the untapped potential of LLMs. We release ProHist-Bench at https://github.com/inclusionAI/ABench/tree/main/ProHist-Bench.