Hung-Chieh Fang
2024
Editing the Mind of Giants: An In-Depth Exploration of Pitfalls of Knowledge Editing in Large Language Models
Cheng-Hsun Hsueh | Paul Kuo-Ming Huang | Tzu-Han Lin | Che Wei Liao | Hung-Chieh Fang | Chao-Wei Huang | Yun-Nung Chen
Findings of the Association for Computational Linguistics: EMNLP 2024
Knowledge editing is a rising technique for efficiently updating factual knowledge in large language models (LLMs) with minimal alteration of parameters. However, recent studies have identified side effects, such as knowledge distortion and the deterioration of general abilities, that have emerged after editing. Despite these findings, evaluating the pitfalls of knowledge editing often relies on inconsistent metrics and benchmarks, lacking a uniform standard. In response, this survey presents a comprehensive study of these side effects, providing a unified perspective on the challenges of knowledge editing in LLMs by conducting experiments with consistent metrics and benchmarks. Additionally, we review related works and outline potential research directions to address these limitations. Our survey highlights the limitations of current knowledge editing methods, emphasizing the need for a deeper understanding of the inner knowledge structures of LLMs and improved knowledge editing methods. To foster future research, we have released the complementary materials publicly (https://github.com/MiuLab/EditLLM-Survey).
2022
Open-Domain Conversational Question Answering with Historical Answers
Hung-Chieh Fang | Kuo-Han Hung | Chen-Wei Huang | Yun-Nung Chen
Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022
Open-domain conversational question answering can be viewed as two tasks: passage retrieval and conversational question answering, where the former relies on selecting candidate passages from a large corpus and the latter requires better understanding of a question with contexts to predict the answers. This paper proposes ConvADR-QA that leverages historical answers to boost retrieval performance and further achieves better answering performance. Our experiments on the benchmark dataset, OR-QuAC, demonstrate that our model outperforms existing baselines in both extractive and generative reader settings, well justifying the effectiveness of historical answers for open-domain conversational question answering.
Co-authors
- Yun-Nung Chen 2
- Cheng-Hsun Hsueh 1
- Paul Kuo-Ming Huang 1
- Tzu-Han Lin 1
- Che-Wei Liao 1