In an era where misinformation spreads freely, fact-checking (FC) plays a crucial role in verifying claims and promoting reliable information. While automated fact-checking (AFC) has advanced significantly, existing systems remain vulnerable to adversarial attacks that manipulate or generate claims, evidence, or claim-evidence pairs. These attacks can distort the truth, mislead decision-makers, and ultimately undermine the reliability of FC models. Despite growing research interest in adversarial attacks against AFC systems, a comprehensive overview of the key challenges is still lacking: understanding attack strategies, assessing the resilience of current models, and identifying ways to enhance robustness. This survey provides the first in-depth review of adversarial attacks targeting FC, categorizing existing attack methodologies and evaluating their impact on AFC systems. Additionally, we examine recent advances in adversary-aware defenses and highlight open research questions that require further exploration. Our findings underscore the urgent need for resilient FC frameworks that can withstand adversarial manipulation while preserving high verification accuracy.
Distractor generation is the task of automatically generating plausible yet incorrect options (i.e., distractors) for fill-in-the-blank and multiple-choice questions. In assessment, distractors must be contextually relevant to the given question and answer. Although recent work focuses on fine-tuning pre-trained encoder-decoder models with data augmentation techniques to generate distractors, these models often fail to capture the full semantic representation of a given question-answer pair and its related distractors. The augmentation methods typically rely on expanding the number of proposed candidates (i.e., questions or distractors), which can introduce noise into the models without necessarily improving their understanding of the deeper semantic relationships between a question-answer pair and its distractors. This paper introduces a novel distractor generation model based on contrastive learning, which trains the model to recognize the essential semantic features needed to generate in-context distractors. Extensive experiments on two public datasets indicate that contrastive learning provides a strong baseline for the distractor generation task. It significantly outperforms recent models, increasing the NDCG@3 score from 24.68 to 32.33 on the MCQ dataset and from 26.66 to 36.68 on the SciQ dataset.
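To make the contrastive objective concrete, the sketch below shows one common way such a loss could be set up for this task; it is an illustration under our own assumptions (a shared encoder producing pooled embeddings, an InfoNCE-style loss with in-batch negatives, and a temperature of 0.07), not the implementation reported in the paper.

```python
# Minimal sketch (not the paper's code): an InfoNCE-style contrastive loss that
# pulls the embedding of a question-answer pair toward its gold in-context
# distractor and pushes it away from in-batch negatives. The encoder choice,
# pooling, and temperature are illustrative assumptions.
import torch
import torch.nn.functional as F


def info_nce_loss(qa_emb: torch.Tensor,
                  distractor_emb: torch.Tensor,
                  temperature: float = 0.07) -> torch.Tensor:
    """qa_emb, distractor_emb: (batch, dim) embeddings from a shared encoder.
    Row i of distractor_emb is the positive for row i of qa_emb; all other
    rows in the batch serve as negatives."""
    qa = F.normalize(qa_emb, dim=-1)
    d = F.normalize(distractor_emb, dim=-1)
    logits = qa @ d.t() / temperature           # (batch, batch) similarity matrix
    targets = torch.arange(qa.size(0), device=qa.device)
    return F.cross_entropy(logits, targets)     # diagonal entries are the positives


if __name__ == "__main__":
    torch.manual_seed(0)
    qa = torch.randn(8, 256)           # e.g. pooled encoder states of question+answer
    distractors = torch.randn(8, 256)  # pooled states of the gold distractors
    print(info_nce_loss(qa, distractors).item())
```

In this setup the diagonal of the similarity matrix corresponds to matched question-answer/distractor pairs, so minimizing the cross-entropy encourages in-context distractors to score higher than unrelated ones drawn from the same batch.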
The distractor generation task focuses on generating incorrect but plausible options for objective questions such as fill-in-the-blank and multiple-choice questions. The task is widely used in educational settings across domains and subjects. The effectiveness of such questions in assessment depends on the quality of the distractors, as they challenge examinees to select the correct answer from a set of misleading options. Advances in artificial intelligence (AI) have shifted the task from traditional methods to neural networks and pre-trained language models. This shift has established new benchmarks and expanded the use of advanced deep learning methods for generating distractors. This survey explores distractor generation tasks, datasets, methods, and current evaluation metrics for English objective questions, covering both text-based and multi-modal domains. It also evaluates existing AI models and benchmarks and discusses potential directions for future research.
Most state-of-the-art methods for abstractive text summarization operate in supervised learning settings and rely heavily on high-quality, large-scale parallel corpora. In this paper, we remove the need for reference summaries and present SCR (Summarize, Contrast and Review), an unsupervised learning method for abstractive summarization that leverages contrastive learning and is the first work to apply contrastive learning to unsupervised abstractive summarization. Specifically, we use the true source documents as positive examples and strategically generated fake source documents as negative examples to train the model to generate good summaries. Furthermore, we improve the writing quality of the generated summaries by guiding them to be similar to human-written text. Extensive experiments show that SCR outperforms other unsupervised abstractive summarization baselines, demonstrating its effectiveness.
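As an illustration of the contrastive setup described above, the following sketch contrasts a generated summary's embedding with its true source document (positive) and a fake source document (negative) using a triplet-style margin loss; the shared encoder, cosine distance, and margin value are assumptions for the sketch and are not taken from the paper.

```python
# Minimal sketch (not the authors' implementation): a triplet-style contrastive
# objective in the spirit of SCR, encouraging a generated summary's embedding to
# be closer to its true source document than to a strategically corrupted
# ("fake") source document. Distance measure and margin are assumptions.
import torch
import torch.nn.functional as F


def summary_contrastive_loss(summary_emb: torch.Tensor,
                             true_src_emb: torch.Tensor,
                             fake_src_emb: torch.Tensor,
                             margin: float = 0.5) -> torch.Tensor:
    """All inputs: (batch, dim) embeddings produced by the same encoder."""
    pos_dist = 1.0 - F.cosine_similarity(summary_emb, true_src_emb, dim=-1)
    neg_dist = 1.0 - F.cosine_similarity(summary_emb, fake_src_emb, dim=-1)
    # Penalize cases where the summary sits closer to the fake source
    # than to the true source by less than `margin`.
    return F.relu(pos_dist - neg_dist + margin).mean()


if __name__ == "__main__":
    torch.manual_seed(0)
    s = torch.randn(4, 768)    # embeddings of generated summaries
    pos = torch.randn(4, 768)  # embeddings of the true source documents
    neg = torch.randn(4, 768)  # embeddings of fake (corrupted) source documents
    print(summary_contrastive_loss(s, pos, neg).item())
```

The margin term here simply enforces a gap between the summary-to-true-source and summary-to-fake-source distances; other contrastive formulations (e.g., InfoNCE over multiple fake documents) would serve the same purpose.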