WatClaimCheck: A new Dataset for Claim Entailment and Inference

Kashif Khan, Ruizhe Wang, Pascal Poupart


Abstract
We contribute a new dataset for the task of automated fact checking and an evaluation of state-of-the-art algorithms. The dataset includes claims (from speeches, interviews, social media, and news articles), review articles published by professional fact checkers, and premise articles used by those fact checkers to support their reviews and verify the veracity of the claims. An important challenge in using premise articles is identifying the relevant passages that help to infer the veracity of a claim. We show that transferring a dense passage retrieval model trained with review articles improves the retrieval quality of passages in premise articles. We report results for the prediction of claim veracity by inference from premise articles.
Anthology ID:
2022.acl-long.92
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1293–1304
URL:
https://aclanthology.org/2022.acl-long.92
DOI:
10.18653/v1/2022.acl-long.92
Cite (ACL):
Kashif Khan, Ruizhe Wang, and Pascal Poupart. 2022. WatClaimCheck: A new Dataset for Claim Entailment and Inference. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1293–1304, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
WatClaimCheck: A new Dataset for Claim Entailment and Inference (Khan et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.92.pdf
Video:
https://aclanthology.org/2022.acl-long.92.mp4
Code:
nxii/watclaimcheck
Data:
PUBHEALTH