@inproceedings{indrehus-etal-2025-semqa,
title = "{S}em{QA}: Evaluating Evidence with Question Embeddings and Answer Entailment for Fact Verification",
author = "Indrehus, Kjetil and
Vannebo, Caroline and
Pop, Roxana",
editor = "Akhtar, Mubashara and
Aly, Rami and
Christodoulopoulos, Christos and
Cocarascu, Oana and
Guo, Zhijiang and
Mittal, Arpit and
Schlichtkrull, Michael and
Thorne, James and
Vlachos, Andreas",
booktitle = "Proceedings of the Eighth Fact Extraction and VERification Workshop (FEVER)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.fever-1.14/",
doi = "10.18653/v1/2025.fever-1.14",
pages = "184--200",
ISBN = "978-1-959429-53-1",
abstract = "Automated fact-checking (AFC) of factual claims require efficiency and accuracy. Existing evaluation frameworks like Ev$^2$R achieve strong semantic grounding but incur substantial computational cost, while simpler metrics based on overlap or one-to-one matching often misalign with human judgments. In this paper, we introduce SemQA, a lightweight and accurate evidence-scoring metric that combines transformer-based question scoring with bidirectional NLI entailment on answers. We evaluate SemQA by conducting human evaluations, analyzing correlations with existing metrics, and examining representative examples."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="indrehus-etal-2025-semqa">
<titleInfo>
<title>SemQA: Evaluating Evidence with Question Embeddings and Answer Entailment for Fact Verification</title>
</titleInfo>
<name type="personal">
<namePart type="given">Kjetil</namePart>
<namePart type="family">Indrehus</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Caroline</namePart>
<namePart type="family">Vannebo</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Roxana</namePart>
<namePart type="family">Pop</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-07</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the Eighth Fact Extraction and VERification Workshop (FEVER)</title>
</titleInfo>
<name type="personal">
<namePart type="given">Mubashara</namePart>
<namePart type="family">Akhtar</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Rami</namePart>
<namePart type="family">Aly</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Christos</namePart>
<namePart type="family">Christodoulopoulos</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Oana</namePart>
<namePart type="family">Cocarascu</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Zhijiang</namePart>
<namePart type="family">Guo</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Arpit</namePart>
<namePart type="family">Mittal</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Michael</namePart>
<namePart type="family">Schlichtkrull</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">James</namePart>
<namePart type="family">Thorne</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Andreas</namePart>
<namePart type="family">Vlachos</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Vienna, Austria</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">978-1-959429-53-1</identifier>
</relatedItem>
<abstract>Automated fact-checking (AFC) of factual claims requires efficiency and accuracy. Existing evaluation frameworks like Ev²R achieve strong semantic grounding but incur substantial computational cost, while simpler metrics based on overlap or one-to-one matching often misalign with human judgments. In this paper, we introduce SemQA, a lightweight and accurate evidence-scoring metric that combines transformer-based question scoring with bidirectional NLI entailment on answers. We evaluate SemQA by conducting human evaluations, analyzing correlations with existing metrics, and examining representative examples.</abstract>
<identifier type="citekey">indrehus-etal-2025-semqa</identifier>
<identifier type="doi">10.18653/v1/2025.fever-1.14</identifier>
<location>
<url>https://aclanthology.org/2025.fever-1.14/</url>
</location>
<part>
<date>2025-07</date>
<extent unit="page">
<start>184</start>
<end>200</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T SemQA: Evaluating Evidence with Question Embeddings and Answer Entailment for Fact Verification
%A Indrehus, Kjetil
%A Vannebo, Caroline
%A Pop, Roxana
%Y Akhtar, Mubashara
%Y Aly, Rami
%Y Christodoulopoulos, Christos
%Y Cocarascu, Oana
%Y Guo, Zhijiang
%Y Mittal, Arpit
%Y Schlichtkrull, Michael
%Y Thorne, James
%Y Vlachos, Andreas
%S Proceedings of the Eighth Fact Extraction and VERification Workshop (FEVER)
%D 2025
%8 July
%I Association for Computational Linguistics
%C Vienna, Austria
%@ 978-1-959429-53-1
%F indrehus-etal-2025-semqa
%X Automated fact-checking (AFC) of factual claims requires efficiency and accuracy. Existing evaluation frameworks like Ev²R achieve strong semantic grounding but incur substantial computational cost, while simpler metrics based on overlap or one-to-one matching often misalign with human judgments. In this paper, we introduce SemQA, a lightweight and accurate evidence-scoring metric that combines transformer-based question scoring with bidirectional NLI entailment on answers. We evaluate SemQA by conducting human evaluations, analyzing correlations with existing metrics, and examining representative examples.
%R 10.18653/v1/2025.fever-1.14
%U https://aclanthology.org/2025.fever-1.14/
%U https://doi.org/10.18653/v1/2025.fever-1.14
%P 184-200
Markdown (Informal)
[SemQA: Evaluating Evidence with Question Embeddings and Answer Entailment for Fact Verification](https://aclanthology.org/2025.fever-1.14/) (Indrehus et al., FEVER 2025)
ACL
Kjetil Indrehus, Caroline Vannebo, and Roxana Pop. 2025. SemQA: Evaluating Evidence with Question Embeddings and Answer Entailment for Fact Verification. In Proceedings of the Eighth Fact Extraction and VERification Workshop (FEVER), pages 184–200, Vienna, Austria. Association for Computational Linguistics.