Yuchun Fan
2024
Advancing Large Language Model Attribution through Self-Improving
Lei Huang | Xiaocheng Feng | Weitao Ma | Liang Zhao | Yuchun Fan | Weihong Zhong | Dongliang Xu | Qing Yang | Hongtao Liu | Bing Qin
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Teaching large language models (LLMs) to generate text with citations to evidence sources can mitigate hallucinations and enhance verifiability in information-seeking systems. However, improving this capability requires high-quality attribution data, which is costly and labor-intensive to obtain. Inspired by recent advances in self-improvement that enhance LLMs without manual annotation, we present START, a Self-Taught AttRibuTion framework for iteratively improving the attribution capability of LLMs. First, to prevent models from stagnating due to initially insufficient supervision signals, START leverages the model to self-construct synthetic training data for warming up. To further self-improve the model’s attribution ability, START iteratively utilizes fine-grained preference supervision signals constructed from its sampled responses to encourage robust, comprehensive, and attributable generation. Experiments on three open-domain question-answering datasets, covering long-form QA and multi-step reasoning, demonstrate significant performance gains of 25.13% on average without relying on human annotations or more advanced models. Further analysis reveals that START excels in aggregating information across multiple sources.
2023
Augmenting Large Language Model Translators via Translation Memories
Yongyu Mu | Abudurexiti Reheman | Zhiquan Cao | Yuchun Fan | Bei Li | Yinqiao Li | Tong Xiao | Chunliang Zhang | Jingbo Zhu
Findings of the Association for Computational Linguistics: ACL 2023
Using translation memories (TMs) as prompts is a promising approach to in-context learning of machine translation models. In this work, we take a step towards prompting large language models (LLMs) with TMs and making them better translators. We find that the ability of LLMs to “understand” prompts is indeed helpful for making better use of TMs. Experiments show that the results of a pre-trained LLM translator can be greatly improved by using high-quality TM-based prompts. These results are even comparable to those of state-of-the-art NMT systems that have access to large-scale in-domain bilingual data and are well tuned on the downstream tasks.