Evidence Selection as a Token-Level Prediction Task

Dominik Stammbach


Abstract
In automated claim verification, we retrieve evidence from a knowledge base to determine the veracity of a claim. Intuitively, retrieving the correct evidence plays a crucial role in this process. Evidence selection is often tackled as a pairwise sentence classification task, i.e., we train a model to predict for each sentence individually whether it is evidence for a claim. In this work, we fine-tune document-level transformers to extract all evidence from a Wikipedia document at once. We show that this approach performs better than a comparable model classifying sentences individually on all relevant evidence selection metrics in FEVER. Our complete pipeline building on this evidence selection procedure produces a new state-of-the-art result on FEVER, a popular claim verification benchmark.
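To make the token-level framing concrete, the following is a minimal sketch of how per-token evidence scores for a whole document could be turned into sentence-level evidence decisions. It is not taken from the paper: the function name `select_evidence`, the mean-pooling rule, the 0.5 threshold, and the toy scores are all assumptions chosen for illustration, and the scores stand in for the output of a fine-tuned document-level model.

```python
from typing import List, Tuple


def select_evidence(
    token_probs: List[float],
    sentence_spans: List[Tuple[int, int]],
    threshold: float = 0.5,
) -> List[int]:
    """Return indices of sentences whose mean token score exceeds the threshold.

    token_probs: one hypothetical evidence probability per token of the document.
    sentence_spans: (start, end) token offsets per sentence, end exclusive.
    Mean pooling and the threshold are illustrative assumptions.
    """
    selected = []
    for idx, (start, end) in enumerate(sentence_spans):
        tokens = token_probs[start:end]
        if not tokens:
            # Skip empty spans rather than dividing by zero.
            continue
        score = sum(tokens) / len(tokens)
        if score > threshold:
            selected.append(idx)
    return selected


# Toy document with three sentences covering eight tokens.
token_probs = [0.9, 0.8, 0.7, 0.1, 0.2, 0.6, 0.9, 0.8]
sentence_spans = [(0, 3), (3, 5), (5, 8)]
evidence = select_evidence(token_probs, sentence_spans)
print(evidence)
```

In this sketch all sentences of a document are scored from a single pass over its tokens, whereas the pairwise setting described in the abstract would score each claim-sentence pair separately.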
Anthology ID:
2021.fever-1.2
Volume:
Proceedings of the Fourth Workshop on Fact Extraction and VERification (FEVER)
Month:
November
Year:
2021
Address:
Dominican Republic
Venues:
EMNLP | FEVER
Publisher:
Association for Computational Linguistics
Pages:
14–20
URL:
https://aclanthology.org/2021.fever-1.2
PDF:
https://aclanthology.org/2021.fever-1.2.pdf
Code
dominiksinsaarland/document-level-fever
Data
FEVER