Caroline Vannebo


2025

Automated fact-checking (AFC) of factual claims requires both efficiency and accuracy. Existing evaluation frameworks such as Ev2R achieve strong semantic grounding but incur substantial computational cost, while simpler metrics based on overlap or one-to-one matching often misalign with human judgments. In this paper, we introduce SemQA, a lightweight and accurate evidence-scoring metric that combines transformer-based question scoring with bidirectional NLI entailment on answers. We evaluate SemQA by conducting human evaluations, analyzing its correlations with existing metrics, and examining representative examples.