FAITH: Factuality Alignment through Integrating Trustworthiness and Honestness

Xiaoning Dong; Chengyan Wu; Yajie Wen; Yu Chen; Yun Xue (薛云); Zhang Jing; Wei Xu; Bolei Ma

Abstract

Large Language Models (LLMs) can generate factually inaccurate content even when they possess the corresponding knowledge, which critically undermines their reliability. Existing approaches attempt to mitigate this by incorporating uncertainty scores into QA prompts during training, but these numerical scores lack the semantic richness an LLM needs to properly understand its internal states of trustworthiness and honestness, leading to insufficient factuality alignment. We introduce FAITH (Factuality Alignment through Integrating Trustworthiness and Honestness), a post-training framework for factuality alignment that integrates natural-language uncertainty signals with external knowledge. Specifically, we augment training datasets by computing confidence scores and semantic entropy from LLM outputs and mapping them into a knowledge state quadrant that describes, in natural language, the model’s internal knowledge possession (trustworthiness) and answering behaviors (honestness). Based on this enhanced data, we design a reward function that considers both correctness and uncertainty signals, and fine-tune the LLM using the Proximal Policy Optimization (PPO) algorithm. To further mitigate weakly grounded responses, we design a retrieval-augmented module that retrieves relevant external passages, improving the consistency between internal and external knowledge representations. Extensive experiments on four knowledge-intensive benchmarks demonstrate that FAITH enhances the factual accuracy and truthfulness of LLMs.
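
The data augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the definitions, so the confidence formula (fraction of sampled answers matching the reference), the exact-match clustering used for semantic entropy, both thresholds, the assignment of confidence to knowledge possession and entropy to answering behavior, and the wording of the four quadrant descriptions are all assumptions made for illustration. The function names `confidence`, `semantic_entropy`, and `knowledge_state` are hypothetical.

```python
import math
from collections import Counter


def normalize(answer: str) -> str:
    """Lowercase and trim an answer so trivially different strings compare equal."""
    return answer.strip().lower()


def confidence(samples: list[str], gold: str) -> float:
    """Assumed definition: fraction of sampled answers that match the reference."""
    return sum(normalize(s) == normalize(gold) for s in samples) / len(samples)


def semantic_entropy(samples: list[str]) -> float:
    """Entropy over clusters of equivalent answers.

    Assumption: exact match after normalization stands in for a real
    semantic-equivalence check (for example, an entailment model).
    """
    counts = Counter(normalize(s) for s in samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def knowledge_state(conf: float, entropy: float,
                    conf_threshold: float = 0.5,
                    entropy_threshold: float = 0.6) -> str:
    """Map the two scores to a natural-language quadrant description.

    Thresholds and wording are hypothetical placeholders.
    """
    knows = conf >= conf_threshold
    consistent = entropy <= entropy_threshold
    descriptions = {
        (True, True): "The model holds the relevant knowledge and answers consistently.",
        (True, False): "The model holds the relevant knowledge but answers inconsistently.",
        (False, True): "The model lacks the relevant knowledge yet answers consistently.",
        (False, False): "The model lacks the relevant knowledge and answers inconsistently.",
    }
    return descriptions[(knows, consistent)]


if __name__ == "__main__":
    samples = ["Paris", "paris", "Paris ", "Lyon"]
    c = confidence(samples, gold="Paris")
    h = semantic_entropy(samples)
    print(c, round(h, 3), knowledge_state(c, h))
```

In a pipeline of this shape, the returned description would be attached to the training prompt in place of a bare numerical score, which is the contrast the abstract draws with earlier approaches.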

Anthology ID: 2026.findings-acl.684
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 13975–13994
URL: https://aclanthology.org/2026.findings-acl.684/
Cite (ACL): Xiaoning Dong, Chengyan Wu, Yajie Wen, Yu Chen, Yun Xue, Zhang Jing, Wei Xu, and Bolei Ma. 2026. FAITH: Factuality Alignment through Integrating Trustworthiness and Honestness. In Findings of the Association for Computational Linguistics: ACL 2026, pages 13975–13994, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): FAITH: Factuality Alignment through Integrating Trustworthiness and Honestness (Dong et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.684.pdf
Checklist: 2026.findings-acl.684.checklist.pdf
