Georgi N. Georgiev
2025
OpenFactCheck: Building, Benchmarking Customized Fact-Checking Systems and Evaluating the Factuality of Claims and LLMs
Yuxia Wang | Minghan Wang | Hasan Iqbal | Georgi N. Georgiev | Jiahui Geng | Iryna Gurevych | Preslav Nakov
Proceedings of the 31st International Conference on Computational Linguistics
The increased use of large language models (LLMs) across a variety of real-world applications calls for mechanisms to verify the factual accuracy of their outputs. Difficulties lie in assessing the factuality of free-form responses in open domains. Moreover, different papers use disparate evaluation benchmarks and measures, which makes them hard to compare and hampers future progress. To mitigate these issues, we propose OpenFactCheck, a unified framework for building customized automatic fact-checking systems, benchmarking their accuracy, evaluating the factuality of LLMs, and verifying claims in a document. OpenFactCheck consists of three modules: (i) CUSTCHECKER allows users to easily customize an automatic fact-checker and verify the factual correctness of documents and claims, (ii) LLMEVAL is a unified evaluation framework that fairly assesses an LLM's factuality from various perspectives, and (iii) CHECKEREVAL is an extensible solution for gauging the reliability of automatic fact-checkers' verification results using human-annotated datasets. Data and code are publicly available at https://github.com/yuxiaw/openfactcheck.