Yihang Wang
2025
MDPO: Customized Direct Preference Optimization with a Metric-based Sampler for Question and Answer Generation
Yihang Wang | Bowen Tian | Yueyang Su | Yixing Fan | Jiafeng Guo
Proceedings of the 31st International Conference on Computational Linguistics
With the extensive use of large language models, automatically generating QA datasets for domain-specific fine-tuning has become crucial. However, considering the multifaceted demands for readability, diversity, and comprehensiveness of QA data, current methodologies fall short in producing high-quality QA datasets. Moreover, the dependence of existing evaluation metrics on ground-truth labels further exacerbates the challenges associated with selecting QA data. In this paper, we introduce a novel method for QA data generation, denoted as MDPO. We propose a set of unsupervised evaluation metrics for QA data, enabling multidimensional assessment based on the relationships among context, question, and answer. Furthermore, leveraging these metrics, we implement a customized direct preference optimization process that guides large language models to produce high-quality, domain-specific QA pairs. Empirical results on public datasets indicate that MDPO’s performance substantially surpasses that of state-of-the-art methods.
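The abstract does not spell out how the metric-based sampler feeds the preference-optimization step, so the sketch below illustrates one plausible reading: sample several candidate QA pairs per context, score each with reference-free metrics, and keep the best and worst candidates as a (chosen, rejected) record in the usual DPO data format. The metric functions (`groundedness`, `readability`) and their weights are illustrative placeholders, not the metrics proposed in the paper.

```python
# Hypothetical sketch of metric-guided preference-pair construction for DPO.
# The metrics below are crude stand-ins, NOT the unsupervised metrics from MDPO.
from dataclasses import dataclass


@dataclass
class QAPair:
    question: str
    answer: str


def groundedness(context: str, qa: QAPair) -> float:
    """Fraction of answer tokens that also appear in the context (placeholder metric)."""
    ctx_tokens = set(context.lower().split())
    ans_tokens = qa.answer.lower().split()
    return sum(t in ctx_tokens for t in ans_tokens) / max(len(ans_tokens), 1)


def readability(qa: QAPair) -> float:
    """Crude proxy that prefers questions of moderate length (placeholder metric)."""
    n = len(qa.question.split())
    return max(0.0, 1.0 - abs(n - 12) / 12)


def composite_score(context: str, qa: QAPair) -> float:
    # Weights are arbitrary for illustration.
    return 0.7 * groundedness(context, qa) + 0.3 * readability(qa)


def build_preference_record(context: str, candidates: list[QAPair]) -> dict:
    """Rank sampled candidates and emit one (chosen, rejected) record for DPO training."""
    ranked = sorted(candidates, key=lambda qa: composite_score(context, qa), reverse=True)
    best, worst = ranked[0], ranked[-1]
    return {
        "prompt": f"Generate a QA pair for the following passage:\n{context}",
        "chosen": f"Q: {best.question}\nA: {best.answer}",
        "rejected": f"Q: {worst.question}\nA: {worst.answer}",
    }
```

Records in this `{"prompt", "chosen", "rejected"}` layout can be passed directly to standard DPO training pipelines; the composite score simply replaces human preference labels as the ranking signal.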
QUITO-X: A New Perspective on Context Compression from the Information Bottleneck Theory
Yihang Wang | Xu Huang | Bowen Tian | Yueyang Su | Lei Yu | Huaming Liao | Yixing Fan | Jiafeng Guo | Xueqi Cheng
Findings of the Association for Computational Linguistics: EMNLP 2025
Generative large language models (LLMs) have achieved remarkable success in various industrial applications, owing to their promising in-context learning capabilities. However, the issue of long context in complex tasks poses a significant barrier to their wider adoption, manifested in two main aspects: (i) the excessively long context leads to high costs and inference delays; (ii) a substantial amount of task-irrelevant information introduced by long contexts exacerbates the “lost in the middle” problem. Existing methods compress context by removing redundant tokens using metrics such as self-information or perplexity (PPL), which is inconsistent with the objective of retaining the most important tokens when conditioning on a given query. In this study, we introduce information bottleneck theory (IB) to model the problem, offering a novel perspective that thoroughly addresses the essential properties required for context compression. Additionally, we propose a cross-attention-based approach to approximate mutual information in IB, which can be flexibly replaced with suitable alternatives in different scenarios. Extensive experiments on four datasets demonstrate that our method achieves a 25% increase in compression rate compared to the state of the art, while maintaining question-answering performance. In particular, the context compressed by our method even outperforms the full context in some cases.
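As one concrete illustration of the cross-attention-based scoring idea described above, the sketch below uses a stock encoder-decoder model from Hugging Face `transformers` to score context tokens by the decoder's cross-attention to them when conditioned on the query, and keeps only the highest-scoring tokens. The choice of `t5-small`, the layer/head/position averaging, and the 50% keep ratio are assumptions made for illustration, not the paper's released implementation.

```python
# Minimal sketch: query-conditioned context compression via cross-attention scores.
# Assumptions: t5-small as a stand-in scorer; uniform averaging over layers,
# heads, and query positions; a fixed 50% token keep ratio.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small").eval()

context = "The Eiffel Tower, completed in 1889, is 330 metres tall and located in Paris."
query = "How tall is the Eiffel Tower?"

ctx = tokenizer(context, return_tensors="pt")
qry = tokenizer(query, return_tensors="pt")

with torch.no_grad():
    out = model(
        input_ids=ctx.input_ids,
        attention_mask=ctx.attention_mask,
        decoder_input_ids=qry.input_ids,
        output_attentions=True,
    )

# out.cross_attentions: one tensor per decoder layer,
# each of shape (batch, heads, query_len, context_len).
# Average over layers, heads, and query positions -> one score per context token.
scores = torch.stack(out.cross_attentions).mean(dim=(0, 2, 3))[0]

# Keep the top 50% of context tokens by score, preserving their original order.
keep = max(1, int(0.5 * scores.numel()))
kept_idx = scores.topk(keep).indices.sort().values
compressed = tokenizer.decode(ctx.input_ids[0, kept_idx], skip_special_tokens=True)
print(compressed)
```

The cross-attention averaging here is the "flexibly replaceable" component the abstract refers to: any other estimate of how informative a context token is about the answer given the query could be substituted for `scores`.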
Co-authors
- Yixing Fan (意兴 范) 2
- Jiafeng Guo (嘉丰 郭) 2
- Yueyang Su 2
- Bowen Tian 2
- Xueqi Cheng (程学旗) 1