Mingjie Zhong
2026
GAST: Gradient-aligned Sparse Tuning of Large Language Models with Data-layer Selection
Kai Yao | Zhenghan Song | Kaixin Wu | Mingjie Zhong | Danzhao Cheng | Zhaorui Tan | Yixin Ji | Penglei Gao
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Parameter-Efficient Fine-Tuning (PEFT) has become a key strategy for adapting large language models, with recent advances in sparse tuning reducing overhead by selectively updating key parameters or subsets of data. Existing approaches generally follow two distinct paradigms: layer-selective methods that fine-tune only critical layers to minimize computational load, and data-selective methods that choose effective training subsets to accelerate training. However, current methods typically overlook the fact that different data points contribute to different model layers to varying degrees, and they often discard potentially valuable information from data perceived as low-quality. To address these limitations, we propose Gradient-aligned Sparse Tuning (GAST), an innovative method that performs selective fine-tuning along both the data and layer dimensions simultaneously, as integral components of a unified optimization strategy. GAST specifically targets informational redundancy by employing a layer-sparse strategy that adaptively selects the most impactful data points for each layer, providing a more comprehensive solution than approaches restricted to a single dimension. Experiments demonstrate that GAST consistently outperforms baseline methods, establishing a promising direction for future research in PEFT strategies.
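The core idea of per-layer, gradient-aligned data selection described in the abstract can be illustrated with a minimal sketch. This is a hypothetical illustration, not the paper's actual algorithm: it assumes per-example gradients for each layer are already available as arrays, and scores each example by the cosine alignment of its gradient with the layer's mean gradient, keeping the top-k examples per layer.

```python
# Hypothetical sketch: per-layer selection of training examples whose
# gradients align best with each layer's average gradient direction.
# Names (select_per_layer, per_example_grads) are illustrative assumptions.
import numpy as np

def select_per_layer(per_example_grads, k):
    """per_example_grads: dict mapping layer name -> array of shape
    (n_examples, dim), one flattened gradient per example.
    Returns dict mapping layer name -> indices of the k examples whose
    gradients have the highest cosine similarity with the mean gradient."""
    selected = {}
    for layer, grads in per_example_grads.items():
        mean_g = grads.mean(axis=0)
        # Cosine similarity of each per-example gradient with the mean.
        denom = np.linalg.norm(grads, axis=1) * np.linalg.norm(mean_g) + 1e-12
        cos = (grads @ mean_g) / denom
        # Keep the k best-aligned examples for this layer.
        selected[layer] = np.argsort(cos)[::-1][:k].tolist()
    return selected
```

Under this toy scoring rule, each layer can end up with a different selected subset, which reflects the abstract's point that data points contribute unequally across layers.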
2025
Towards Boosting LLMs-driven Relevance Modeling with Progressive Retrieved Behavior-augmented Prompting
Zeyuan Chen | Haiyan Wu | Kaixin Wu | Wei Chen | Mingjie Zhong | Jia Xu | Zhongyi Liu | Wei Zhang
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track
This paper studies the relevance modeling problem by integrating world knowledge stored in the parameters of LLMs with specialized domain knowledge represented by user behavior data, achieving promising performance. We propose a novel framework, ProRBP, which develops a user-driven behavior neighbor retrieval module to learn domain-specific knowledge in a timely manner and introduces a progressive prompting and aggregation module that accounts for diverse aspects of relevance and for prediction stability. We explore an industrial implementation that deploys LLMs to handle the full-scale search traffic of Alipay with acceptable cost and latency. Comprehensive experiments on real-world industry data and online A/B testing validate the superiority of our proposal and the effectiveness of its main modules.
2024
Retrieval and Reasoning on KGs: Integrate Knowledge Graphs into Large Language Models for Complex Question Answering
Yixin Ji | Kaixin Wu | Juntao Li | Wei Chen | Mingjie Zhong | Xu Jia | Min Zhang
Findings of the Association for Computational Linguistics: EMNLP 2024
Although Large Language Models (LLMs) have performed impressively on various Natural Language Processing (NLP) tasks, their inherent hallucination phenomena severely undermine their credibility in complex reasoning. Combining explainable Knowledge Graphs (KGs) with LLMs is a promising path to address this issue. However, structured KGs are difficult to utilize, and how to make LLMs understand and incorporate them remains a challenging topic. We therefore reorganize KGs into a more efficient structure and design KG-related instruction tuning and continual pre-training strategies that enable LLMs to learn and internalize this form of representation effectively. Moreover, we construct subgraphs to further enhance the retrieval capabilities of KGs via CoT reasoning. Extensive experiments on two KGQA datasets demonstrate that our model achieves convincing performance compared to strong baselines.