AraFacts: The First Large Arabic Dataset of Naturally Occurring Claims
Zien Sheikh Ali | Watheq Mansour | Tamer Elsayed | Abdulaziz Al‐Ali
Proceedings of the Sixth Arabic Natural Language Processing Workshop
We introduce AraFacts, the first large Arabic dataset of naturally occurring claims, collected from five Arabic fact-checking websites, e.g., Fatabyyano and Misbar, and covering claims since 2016. Our dataset consists of 6,121 claims along with their factual labels and additional metadata, such as fact-checking article content, topical category, and links to posts or Web pages spreading the claim. Since the data is obtained from various fact-checking websites, we standardize the original claim labels to provide a unified label rating for all claims. Moreover, we provide dataset statistics and motivate the use of the dataset by suggesting possible research applications. The dataset is made publicly available for the research community.
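
The label standardization step mentioned in the abstract could look roughly like the following minimal sketch. It is only an illustration: the raw label strings, the unified rating names, the `other` fallback, and the record layout (`source`, `label`) are all hypothetical assumptions and are not taken from AraFacts or from any of the fact-checking websites.

```python
from collections import Counter

# Hypothetical mapping from site-specific labels to a unified rating.
# These strings are illustrative assumptions, not the actual AraFacts labels.
UNIFIED_LABELS = {
    "false": "false",
    "fake": "false",
    "misleading": "partly-false",
    "partly false": "partly-false",
    "true": "true",
    "correct": "true",
}


def normalize_label(raw_label: str, default: str = "other") -> str:
    """Map a raw label to a unified rating, ignoring case and extra whitespace."""
    key = " ".join(raw_label.strip().lower().split())
    return UNIFIED_LABELS.get(key, default)


def label_distribution(claims):
    """Count unified ratings over claim records that carry a 'label' field."""
    return Counter(normalize_label(claim["label"]) for claim in claims)


# Toy records standing in for claims scraped from different websites.
claims = [
    {"source": "site_a", "label": "Fake"},
    {"source": "site_b", "label": " partly  false "},
    {"source": "site_c", "label": "Correct"},
    {"source": "site_a", "label": "Satire"},
]

if __name__ == "__main__":
    for rating, count in sorted(label_distribution(claims).items()):
        print(rating, count)
```

Keeping the mapping in a single dictionary makes the standardization auditable: any raw label that is not listed falls into the fallback rating rather than being silently merged into an existing one.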