FactAlign: Long-form Factuality Alignment of Large Language Models

Chao-Wei Huang, Yun-Nung Chen


Abstract
Large language models have demonstrated significant potential as next-generation information access engines. However, their reliability is hindered by hallucination and the generation of non-factual content. This is particularly problematic in long-form responses, where assessing and ensuring factual accuracy is complex. In this paper, we address this gap by proposing FactAlign, a novel alignment framework designed to enhance the factuality of LLMs’ long-form responses while maintaining their helpfulness. We introduce fKTO, a fine-grained, sentence-level alignment algorithm that extends the Kahneman-Tversky Optimization (KTO) alignment method. Leveraging recent advances in automatic factuality evaluation, FactAlign uses fine-grained factuality assessments to guide the alignment process. Our experiments on open-domain prompts and information-seeking questions demonstrate that FactAlign significantly improves the factual accuracy of LLM responses while also improving their helpfulness. Further analyses show that FactAlign can train LLMs to provide more information without losing factual precision, thereby improving the factual F1 score. Our source code, datasets, and trained models are publicly available at https://github.com/MiuLab/FactAlign.
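
The abstract describes fKTO as a sentence-level extension of the KTO alignment objective, where an automatic factuality evaluator labels each sentence of a long-form response as supported or unsupported. The sketch below illustrates what such a sentence-level KTO-style loss could look like; the function name, argument names, and the use of the batch-mean log-ratio as the reference point are illustrative assumptions, not the paper's actual fKTO implementation.

import torch

def sentence_level_kto_loss(policy_logps, ref_logps, is_factual,
                            beta=0.1, lambda_d=1.0, lambda_u=1.0):
    """Illustrative sentence-level KTO-style loss (hypothetical sketch, not the paper's exact fKTO).

    policy_logps, ref_logps: (num_sentences,) summed token log-probabilities of
        each sentence under the policy and the frozen reference model.
    is_factual: (num_sentences,) bool tensor from an automatic factuality
        evaluator (True = judged supported, False = judged unsupported).
    """
    # Per-sentence implicit reward: log-ratio between policy and reference.
    logratios = policy_logps - ref_logps

    # KTO anchors rewards to a reference point derived from the policy/reference
    # KL; a simple practical estimate is the (detached) batch-mean log-ratio.
    z_ref = logratios.mean().detach().clamp(min=0)

    # Factual sentences are pushed above the reference point, non-factual
    # sentences below it, with asymmetric weights as in standard KTO.
    desirable = lambda_d * (1 - torch.sigmoid(beta * (logratios - z_ref)))
    undesirable = lambda_u * (1 - torch.sigmoid(beta * (z_ref - logratios)))

    return torch.where(is_factual, desirable, undesirable).mean()

In this reading, the fine-grained factuality assessments play the role of KTO's binary desirability labels, applied per sentence rather than per response, and lambda_d / lambda_u correspond to KTO's asymmetric weights on desirable and undesirable examples.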
Anthology ID:
2024.findings-emnlp.955
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
16363–16375
URL:
https://aclanthology.org/2024.findings-emnlp.955
DOI:
10.18653/v1/2024.findings-emnlp.955
Cite (ACL):
Chao-Wei Huang and Yun-Nung Chen. 2024. FactAlign: Long-form Factuality Alignment of Large Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 16363–16375, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
FactAlign: Long-form Factuality Alignment of Large Language Models (Huang & Chen, Findings 2024)
PDF:
https://aclanthology.org/2024.findings-emnlp.955.pdf