Pengcheng Wang

2024

The increasing scale of large language models (LLMs) brings emergent abilities to various complex tasks requiring reasoning, such as arithmetic and commonsense reasoning. It is known that the effective design of task-specific prompts is critical for LLMs’ ability to produce high-quality answers. In particular, an effective approach for complex question-answering tasks is example-based prompting with chain-of-thought (CoT) reasoning, which significantly improves the performance of LLMs. However, current CoT methods rely on a fixed set of human-annotated exemplars, which are not necessarily the most effective examples for different tasks. This paper proposes a new method, Active-Prompt, to adapt LLMs to different tasks with task-specific example prompts (annotated with human-designed CoT reasoning). For this purpose, we propose a solution to the key problem of determining which questions are the most important and helpful to annotate from a pool of task-specific queries. By borrowing ideas from the related problem of uncertainty-based active learning, we introduce several metrics to characterize the uncertainty so as to select the most uncertain questions for annotation. Experimental results show that our proposed method achieves superior performance on eight complex reasoning tasks. Further analyses of different uncertainty metrics, pool sizes, zero-shot learning, and accuracy-uncertainty relationships demonstrate the effectiveness of our method.
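The selection step described in this abstract can be illustrated with a minimal sketch. The abstract does not name the uncertainty metrics, so the two used here (the fraction of distinct answers and the entropy of the answer distribution over several sampled answers) are assumptions chosen for illustration, and the function names and sample data are hypothetical rather than taken from the paper's code.

```python
import math
from collections import Counter


def disagreement(answers):
    """Assumed metric: fraction of distinct answers among the sampled answers."""
    return len(set(answers)) / len(answers)


def entropy(answers):
    """Assumed metric: Shannon entropy (in nats) of the empirical answer distribution."""
    counts = Counter(answers)
    k = len(answers)
    return -sum((c / k) * math.log(c / k) for c in counts.values())


def select_most_uncertain(pool, n, metric=disagreement):
    """Return the n questions whose sampled answers are most uncertain.

    pool maps each question to a list of answers sampled from the model
    for that question (hypothetical data layout).
    """
    ranked = sorted(pool, key=lambda q: metric(pool[q]), reverse=True)
    return ranked[:n]


# Hypothetical pool: three questions, four sampled answers each.
sampled = {
    "q1": ["4", "4", "4", "4"],   # model always agrees with itself
    "q2": ["7", "9", "7", "12"],  # three distinct answers
    "q3": ["3", "5", "3", "3"],   # two distinct answers
}

to_annotate = select_most_uncertain(sampled, 2)
print(to_annotate)
```

In this sketch the selected questions would then be handed to human annotators for CoT exemplars; swapping `metric=entropy` into `select_most_uncertain` ranks the same pool by the second assumed metric.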


Table data is pervasive in various industries, and its comprehension and manipulation demand significant time and effort for users seeking to extract relevant information. Consequently, an increasing number of studies have been directed towards table-to-text generation tasks. However, most existing methods are benchmarked solely on a limited number of datasets with varying configurations, leading to a lack of unified, standardized, fair, and comprehensive comparison between methods. This paper presents OpenT2T, the first open-source toolkit for table-to-text generation, designed to reproduce existing large language models (LLMs) for performance comparison and expedite the development of new models. We have implemented and compared a wide range of LLMs under zero- and few-shot settings on 9 table-to-text generation datasets, covering data insight generation, table summarization, and free-form table question answering. Additionally, we maintain a public leaderboard to provide insights for future work into how to choose appropriate table-to-text generation systems for real-world scenarios.


Towards Robust Evidence-Aware Fake News Detection via Improving Semantic Perception
Yike Wu | Yang Xiao | Mengting Hu | Mengying Liu | Pengcheng Wang | Mingming Liu
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Evidence-aware fake news detection aims to determine the veracity of a given news item (i.e., a claim) using external evidence. We find that existing methods lack sufficient semantic perception and are easily blinded by textual expressions. For example, they still make the same prediction after we flip the semantics of a claim, which makes them vulnerable to malicious attacks. In this paper, we propose a model-agnostic training framework to improve the semantic perception of evidence-aware fake news detection. Specifically, we first introduce two kinds of data augmentation to complement the original training set with synthetic data. The semantic-flipped augmentation synthesizes claims with similar textual expressions but opposite semantics, while the semantic-invariant augmentation synthesizes claims with the same semantics but different writing styles. Moreover, we design a novel module to learn a better claim representation that is more sensitive to semantics, and further incorporate it into a multi-objective optimization paradigm. In the experiments, we also extend the original test set of benchmark datasets with the synthetic data to better evaluate the model's perception of semantics. Experimental results demonstrate that our approach significantly outperforms the state-of-the-art methods on the extended test set, while achieving competitive performance on the original one. Our source code is released at https://github.com/Xyang1998/RobustFND.