Hithesh Sankararaman
2024
Provenance: A Light-weight Fact-checker for Retrieval Augmented LLM Generation Output
Hithesh Sankararaman | Mohammed Nasheed Yasin | Tanner Sorensen | Alessandro Di Bari | Andreas Stolcke
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track
We present a light-weight approach for detecting nonfactual outputs from retrieval-augmented generation (RAG). Given a context and putative output, we compute a factuality score that can be thresholded to yield a binary decision to check the results of LLM-based question-answering, summarization, or other systems. Unlike factuality checkers that themselves rely on LLMs, we use compact, open-source natural language inference (NLI) models that yield a freely accessible solution with low latency and low cost at run-time, and no need for LLM fine-tuning. The approach also enables downstream mitigation and correction of hallucinations, by tracing them back to specific context chunks. Our experiments show high ROC-AUC across a wide range of relevant open-source datasets, indicating the effectiveness of our method for fact-checking RAG output.
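The score-then-threshold idea in the abstract can be illustrated with a rough sketch. Everything named here is hypothetical and not taken from the paper: `overlap_score` is a toy token-overlap stand-in for an NLI entailment probability, `fact_check` and its default threshold of 0.5 are illustrative, and taking the maximum over chunks is an assumed aggregation rule. In practice the scorer would be replaced by a compact NLI model.

```python
from typing import Callable, List, Optional, Tuple


def overlap_score(premise: str, hypothesis: str) -> float:
    """Toy stand-in for an NLI entailment probability (NOT a real NLI model):
    the fraction of hypothesis tokens that also occur in the premise."""
    premise_tokens = set(premise.lower().split())
    hypothesis_tokens = hypothesis.lower().split()
    if not hypothesis_tokens:
        return 0.0
    hits = sum(tok in premise_tokens for tok in hypothesis_tokens)
    return hits / len(hypothesis_tokens)


def fact_check(
    chunks: List[str],
    output: str,
    scorer: Callable[[str, str], float] = overlap_score,
    threshold: float = 0.5,  # assumed value; would be tuned on labeled data
) -> Tuple[float, bool, Optional[int]]:
    """Score the output against every context chunk and threshold the result.

    Returns (factuality score, binary decision, index of the best-supporting
    chunk). The max over chunks is an assumption made for this sketch.
    """
    if not chunks:
        return 0.0, False, None
    scores = [scorer(chunk, output) for chunk in chunks]
    best = max(range(len(scores)), key=scores.__getitem__)
    return scores[best], scores[best] >= threshold, best


context = ["the eiffel tower is in paris", "bananas are yellow"]
print(fact_check(context, "the eiffel tower is in paris"))
print(fact_check(context, "the moon is made of cheese"))
```

The returned chunk index is what would let a downstream step trace a supported or unsupported output back to a specific piece of context.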
2023
Controllable Discovery of Intents: Incremental Deep Clustering Using Semi-Supervised Contrastive Learning
Mrinal Rawat | Hithesh Sankararaman | Victor Barres
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Co-authors
- Alessandro Di Bari 1
- Victor Barres 1
- Mrinal Rawat 1
- Tanner Sorensen 1
- Andreas Stolcke 1