Shihao Liu
2025
Utility-Focused LLM Annotation for Retrieval and Retrieval-Augmented Generation
Hengran Zhang | Minghao Tang | Keping Bi | Jiafeng Guo | Shihao Liu | Daiting Shi | Dawei Yin | Xueqi Cheng
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
This paper explores the use of large language models (LLMs) for annotating document utility in training retrieval and retrieval-augmented generation (RAG) systems, aiming to reduce dependence on costly human annotations. We address the gap between retrieval relevance and generative utility by employing LLMs to annotate document utility. To effectively utilize multiple positive samples per query, we introduce a novel loss that maximizes their summed marginal likelihood. Using the Qwen-2.5-32B model, we annotate utility on the MS MARCO dataset and conduct retrieval experiments on MS MARCO and BEIR, as well as RAG experiments on MS MARCO QA, NQ, and HotpotQA. Our results show that LLM-generated annotations enhance out-of-domain retrieval performance and improve RAG outcomes compared to models trained solely on human annotations or downstream QA metrics. Furthermore, combining LLM annotations with just 20% of human labels achieves performance comparable to using full human annotations. Our study offers a comprehensive approach to utilizing LLM annotations for initializing QA systems on new corpora.
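A minimal PyTorch sketch of one way to implement a multi-positive objective that maximizes the summed marginal likelihood of the positives, assuming a dual-encoder scoring setup; the tensor shapes, temperature, and function names are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def summed_marginal_likelihood_loss(query_emb, doc_emb, positive_mask, temperature=0.05):
    """Multi-positive listwise loss (illustrative sketch).

    query_emb:     (B, d)    query embeddings
    doc_emb:       (B, N, d) candidate document embeddings per query
    positive_mask: (B, N)    True where the candidate is labeled positive
    """
    # Similarity scores between each query and its N candidates.
    scores = torch.einsum("bd,bnd->bn", query_emb, doc_emb) / temperature  # (B, N)
    log_probs = F.log_softmax(scores, dim=-1)                              # (B, N)

    # log sum_{p in positives} P(p | q) = logsumexp over the positives' log-probs.
    pos_log_probs = log_probs.masked_fill(~positive_mask, float("-inf"))
    loss = -torch.logsumexp(pos_log_probs, dim=-1).mean()
    return loss
```

Unlike averaging a separate cross-entropy term per positive, summing the probabilities before taking the log lets the model satisfy the objective as long as the total mass on the positive set is high, which is one natural reading of "summed marginal likelihood."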
RACQC: Advanced Retrieval-Augmented Generation for Chinese Query Correction
Jinbo Su | Lingzhe Gao | Wei Li | Shihao Liu | Haojie Lei | Xinyi Wang | Yuanzhao Guo | Ke Wang | Daiting Shi | Dawei Yin
Findings of the Association for Computational Linguistics: EMNLP 2025
In web search scenarios, erroneous queries frequently degrade users’ experience through irrelevant results, underscoring the pivotal role of Chinese Spelling Check (CSC) systems. Although large language models (LLMs) exhibit remarkable capabilities across many tasks, they face critical challenges in the CSC scenario: (1) poor generalization to rare entities in open-domain searches, and (2) failure to adapt to temporal entity variations due to static parameters, resulting in serious over-correction issues. To tackle these challenges, we present RACQC, a **C**hinese **Q**uery **C**orrection system with **R**etrieval-**A**ugmented Generation (RAG) and multi-task learning. Specifically, our approach (1) integrates dynamic knowledge retrieval through entity-centric RAG to address rare entities and introduces an entity-title collaborative corpus, and (2) employs contrastive correction tasks to mitigate LLM over-correction tendencies. Furthermore, we propose MDCQC, a **M**ulti-**D**omain **C**hinese **Q**uery **C**orrection benchmark, to test the model’s entity correction capabilities. Extensive experiments on several datasets show that RACQC significantly outperforms existing baselines on CSC tasks, achieving a maximum improvement of +9.92% on the search-scenario benchmark and +3.2% on the general-domain dataset under the F1 metric.
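A minimal Python sketch of a retrieve-then-correct step in the spirit the abstract describes, assuming generic `retrieve` and `generate` callables; the interfaces, prompt wording, and `top_k` value are illustrative assumptions rather than RACQC's actual pipeline.

```python
from typing import Callable, List

def correct_query(
    query: str,
    retrieve: Callable[[str, int], List[str]],  # returns entity/title strings for the query
    generate: Callable[[str], str],             # LLM text-generation call
    top_k: int = 5,
) -> str:
    # 1. Retrieve entity-title entries related to the (possibly misspelled) query.
    context = "\n".join(f"- {entry}" for entry in retrieve(query, top_k))

    # 2. Ground the correction in the retrieved entities, and instruct the model to
    #    leave already-correct queries unchanged to curb over-correction.
    prompt = (
        f"Retrieved entities:\n{context}\n\n"
        f"Input query: {query}\n"
        "Correct any spelling errors in the query. If it is already correct, "
        "return it unchanged.\nCorrected query:"
    )
    return generate(prompt).strip()
```

The design choice worth noting is the explicit "return it unchanged" instruction: grounding alone handles rare or newly emerging entities, while the copy-if-correct behavior targets the over-correction tendency the abstract highlights.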