Automated Claim Detection for Fact-checking: A Case Study using Norwegian Pre-trained Language Models

Ghazaal Sheikhi, Samia Touileb, Sohail Khan


Abstract
We investigate to what extent pre-trained language models can be used for automated claim detection for fact-checking in a low-resource setting. We explore this idea by fine-tuning four Norwegian pre-trained language models on the binary classification task of determining whether a claim should be discarded or upheld for further processing by human fact-checkers. We conduct a set of experiments to compare the performance of the language models, and provide a simple baseline model using an SVM with tf-idf features. Since we focus on claim detection, we emphasize the recall score for the upheld class over other performance measures. Our experiments indicate that the language models are superior to the baseline system in terms of F1, while the baseline model achieves the highest precision. However, two of the Norwegian models, NorBERT2 and NB-BERT_large, yield superior F1 and recall values, respectively. We argue that large language models could be successfully employed to solve the automated claim detection problem, and that the choice of model depends on the desired end goal. Moreover, our error analysis shows that the language models are generally less sensitive to changes in claim length and source than the SVM model.
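The abstract's emphasis on recall for the upheld class, as opposed to precision or F1, can be illustrated with a minimal sketch. The function name `class_metrics`, the label strings, and the toy predictions below are assumptions made purely for illustration; they are not the authors' code or data.

```python
from typing import Sequence, Tuple


def class_metrics(
    gold: Sequence[str], pred: Sequence[str], target: str = "upheld"
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for the `target` class."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    pairs = list(zip(gold, pred))
    # True positives: claims that should be upheld and were upheld.
    tp = sum(1 for g, p in pairs if g == target and p == target)
    # False positives: claims upheld by the model that should be discarded.
    fp = sum(1 for g, p in pairs if g != target and p == target)
    # False negatives: check-worthy claims the model discarded.
    fn = sum(1 for g, p in pairs if g == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels, invented for illustration only (not data from the paper).
gold = ["upheld", "upheld", "upheld", "discarded", "discarded", "discarded"]
# A hypothetical cautious classifier: rarely upholds, so it misses claims.
cautious = ["upheld", "discarded", "discarded", "discarded", "discarded", "discarded"]
# A hypothetical eager classifier: upholds more, at the cost of one false alarm.
eager = ["upheld", "upheld", "upheld", "upheld", "discarded", "discarded"]

for name, pred in [("cautious", cautious), ("eager", eager)]:
    p, r, f = class_metrics(gold, pred)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

In this toy setup the cautious classifier has the higher precision but discards two of the three check-worthy claims, while the eager one recovers all of them. This is the kind of trade-off behind the remark that the choice of model depends on the desired end goal.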
Anthology ID:
2023.nodalida-1.1
Volume:
Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
Month:
May
Year:
2023
Address:
Tórshavn, Faroe Islands
Editors:
Tanel Alumäe, Mark Fishel
Venue:
NoDaLiDa
Publisher:
University of Tartu Library
Pages:
1–9
URL:
https://aclanthology.org/2023.nodalida-1.1
Cite (ACL):
Ghazaal Sheikhi, Samia Touileb, and Sohail Khan. 2023. Automated Claim Detection for Fact-checking: A Case Study using Norwegian Pre-trained Language Models. In Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), pages 1–9, Tórshavn, Faroe Islands. University of Tartu Library.
Cite (Informal):
Automated Claim Detection for Fact-checking: A Case Study using Norwegian Pre-trained Language Models (Sheikhi et al., NoDaLiDa 2023)
PDF:
https://aclanthology.org/2023.nodalida-1.1.pdf