Mubashara Akhtar


2024

ChartCheck: Explainable Fact-Checking over Real-World Chart Images
Mubashara Akhtar | Nikesh Subedi | Vivek Gupta | Sahar Tahmasebi | Oana Cocarascu | Elena Simperl
Findings of the Association for Computational Linguistics: ACL 2024

Whilst fact verification has attracted substantial interest in the natural language processing community, verifying misinforming statements against data visualizations such as charts has so far been overlooked. Charts are commonly used in the real world to summarize and communicate key information, but they can also be easily misused to spread misinformation and promote certain agendas. In this paper, we introduce ChartCheck, a novel, large-scale dataset for explainable fact-checking against real-world charts, consisting of 1.7k charts and 10.5k human-written claims and explanations. We systematically evaluate ChartCheck using vision-language and chart-to-table models, and propose a baseline to the community. Finally, we study chart reasoning types and visual attributes that pose a challenge to these models.

Proceedings of the Seventh Fact Extraction and VERification Workshop (FEVER)
Michael Schlichtkrull | Yulong Chen | Chenxi Whitehouse | Zhenyun Deng | Mubashara Akhtar | Rami Aly | Zhijiang Guo | Christos Christodoulopoulos | Oana Cocarascu | Arpit Mittal | James Thorne | Andreas Vlachos
Proceedings of the Seventh Fact Extraction and VERification Workshop (FEVER)

The Automated Verification of Textual Claims (AVeriTeC) Shared Task
Michael Schlichtkrull | Yulong Chen | Chenxi Whitehouse | Zhenyun Deng | Mubashara Akhtar | Rami Aly | Zhijiang Guo | Christos Christodoulopoulos | Oana Cocarascu | Arpit Mittal | James Thorne | Andreas Vlachos
Proceedings of the Seventh Fact Extraction and VERification Workshop (FEVER)

The Automated Verification of Textual Claims (AVeriTeC) shared task asks participants to retrieve evidence and predict veracity for real-world claims checked by fact-checkers. Evidence can be found either via a search engine, or via a knowledge store provided by the organisers. Submissions are evaluated using the AVeriTeC score, which considers a claim to be accurately verified if and only if both the verdict is correct and retrieved evidence is considered to meet a certain quality threshold. The shared task received 21 submissions, 18 of which surpassed our baseline. The winning team was TUDA_MAI with an AVeriTeC score of 63%. In this paper we describe the shared task, present the full results, and highlight key takeaways from the shared task.
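The scoring rule described in this abstract can be illustrated with a minimal sketch. It is not the organisers' implementation: the `Prediction` class, the `evidence_quality` field, and the `threshold` parameter are hypothetical names, and the abstract does not say how evidence quality is measured or what the threshold is. The sketch only shows the "if and only if" condition, where a claim counts as verified when the verdict matches and the evidence score reaches the threshold.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Prediction:
    """One system output for one claim (hypothetical structure)."""
    predicted_verdict: str
    gold_verdict: str
    evidence_quality: float  # assumed: some evidence-quality score, higher is better


def averitec_style_score(predictions: Sequence[Prediction], threshold: float) -> float:
    """Fraction of claims with a correct verdict AND evidence at or above the threshold.

    The threshold is a caller-supplied assumption; the abstract does not state it.
    """
    if not predictions:
        return 0.0
    verified = sum(
        1
        for p in predictions
        if p.predicted_verdict == p.gold_verdict and p.evidence_quality >= threshold
    )
    return verified / len(predictions)


# Illustrative data only; the labels and the 0.5 threshold are arbitrary.
example = [
    Prediction("Supported", "Supported", 0.9),  # correct verdict, good evidence
    Prediction("Refuted", "Supported", 0.9),    # wrong verdict
    Prediction("Refuted", "Refuted", 0.1),      # correct verdict, weak evidence
]
score = averitec_style_score(example, threshold=0.5)
```

In this toy example only the first claim satisfies both conditions, so a correct verdict backed by weak evidence earns no credit.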

2023

Reading and Reasoning over Chart Images for Evidence-based Automated Fact-Checking
Mubashara Akhtar | Oana Cocarascu | Elena Simperl
Findings of the Association for Computational Linguistics: EACL 2023

Evidence data for automated fact-checking (AFC) can be in multiple modalities such as text, tables, images, audio, or video. While there is increasing interest in using images for AFC, previous works mostly focus on detecting manipulated or fake images. We propose a novel task, chart-based fact-checking, and introduce ChartBERT as the first model for AFC against chart evidence. ChartBERT leverages textual, structural and visual information of charts to determine the veracity of textual claims. For evaluation, we create ChartFC, a new dataset of 15,886 charts. We systematically evaluate 75 different vision-language (VL) baselines and show that ChartBERT outperforms VL models, achieving 63.8% accuracy. Our results suggest that the task is complex yet feasible, with many challenges ahead.

Multimodal Automated Fact-Checking: A Survey
Mubashara Akhtar | Michael Schlichtkrull | Zhijiang Guo | Oana Cocarascu | Elena Simperl | Andreas Vlachos
Findings of the Association for Computational Linguistics: EMNLP 2023

Misinformation is often conveyed in multiple modalities, e.g. a miscaptioned image. Multimodal misinformation is perceived as more credible by humans, and spreads faster than its text-only counterparts. While an increasing body of research investigates automated fact-checking (AFC), previous surveys mostly focus on text. In this survey, we conceptualise a framework for AFC including subtasks unique to multimodal misinformation. Furthermore, we discuss related terms used in different communities and map them to our framework. We focus on four modalities prevalent in real-world fact-checking: text, image, audio, and video. We survey benchmarks and models, and discuss limitations and promising directions for future research.

Exploring the Numerical Reasoning Capabilities of Language Models: A Comprehensive Analysis on Tabular Data
Mubashara Akhtar | Abhilash Shankarampeta | Vivek Gupta | Arpit Patil | Oana Cocarascu | Elena Simperl
Findings of the Association for Computational Linguistics: EMNLP 2023

Numerical data plays a crucial role in various real-world domains like finance, economics, and science. Thus, understanding and reasoning with numbers are essential in these fields. Recent benchmarks have assessed the numerical reasoning abilities of language models, revealing their limitations in limited and specific numerical aspects. In this paper, we propose a complete hierarchical taxonomy for numerical reasoning skills, encompassing over ten reasoning types across four levels: representation, number sense, manipulation, and complex reasoning. We conduct a comprehensive evaluation of state-of-the-art models on all reasoning types. To identify challenging reasoning types for different model types, we develop a diverse and extensive set of numerical probes and measure performance shifts. By employing a semi-automated approach, we focus on the tabular Natural Language Inference (TNLI) task as a case study. While no single model excels in all reasoning types, FlanT5 (few-/zero-shot) and GPT3.5 (few-shot) demonstrate strong overall numerical reasoning skills compared to other models in our probes.

Proceedings of the Sixth Fact Extraction and VERification Workshop (FEVER)
Mubashara Akhtar | Rami Aly | Christos Christodoulopoulos | Oana Cocarascu | Zhijiang Guo | Arpit Mittal | Michael Schlichtkrull | James Thorne | Andreas Vlachos
Proceedings of the Sixth Fact Extraction and VERification Workshop (FEVER)

2022

PubHealthTab: A Public Health Table-based Dataset for Evidence-based Fact Checking
Mubashara Akhtar | Oana Cocarascu | Elena Simperl
Findings of the Association for Computational Linguistics: NAACL 2022

Inspired by human fact checkers, who use different types of evidence (e.g. tables, images, audio) in addition to text, several datasets with tabular evidence data have been released in recent years. Whilst the datasets encourage research on table fact-checking, they rely on information from restricted data sources, such as Wikipedia, for creating claims and extracting evidence data, making the fact-checking process different from the real-world process used by fact checkers. In this paper, we introduce PubHealthTab, a table fact-checking dataset based on real-world public health claims and noisy evidence tables from sources similar to those used by real fact checkers. We outline our approach for collecting evidence data from various websites and present an in-depth analysis of our dataset. Finally, we evaluate state-of-the-art table representation and pre-trained models fine-tuned on our dataset, achieving an overall F1 score of 0.73.