Mingjie Zhong


2025

Towards Boosting LLMs-driven Relevance Modeling with Progressive Retrieved Behavior-augmented Prompting
Zeyuan Chen | Haiyan Wu | Kaixin Wu | Wei Chen | Mingjie Zhong | Jia Xu | Zhongyi Liu | Wei Zhang
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track

This paper studies the relevance modeling problem by integrating world knowledge stored in the parameters of LLMs with specialized domain knowledge represented by user behavior data, with the aim of improving performance. We propose a novel framework, ProRBP, which develops a user-driven behavior neighbor retrieval module to learn domain-specific knowledge in a timely manner and introduces a progressive prompting and aggregation module to account for diverse aspects of relevance and to improve prediction stability. We explore an industrial implementation that deploys LLMs to handle the full-scale search traffic of Alipay with acceptable cost and latency. Comprehensive experiments on real-world industry data and online A/B testing validate the superiority of our proposal and the effectiveness of its main modules.

2024

Retrieval and Reasoning on KGs: Integrate Knowledge Graphs into Large Language Models for Complex Question Answering
Yixin Ji | Kaixin Wu | Juntao Li | Wei Chen | Mingjie Zhong | Xu Jia | Min Zhang
Findings of the Association for Computational Linguistics: EMNLP 2024

Although Large Language Models (LLMs) have performed impressively in various Natural Language Processing (NLP) tasks, their inherent hallucinations severely undermine their credibility in complex reasoning. Combining explainable Knowledge Graphs (KGs) with LLMs is a promising path to address this issue. However, structured KGs are difficult to utilize, and making LLMs understand and incorporate them remains challenging. We therefore reorganize KGs into a more efficient structure and design KG-related instruction tuning and continual pre-training strategies that enable LLMs to learn and internalize this form of representation effectively. Moreover, we construct subgraphs to further enhance the retrieval capabilities of KGs via chain-of-thought (CoT) reasoning. Extensive experiments on two KGQA datasets demonstrate that our model achieves convincing performance compared to strong baselines.