2024
pdf
bib
abs
Does DetectGPT Fully Utilize Perturbation? Bridging Selective Perturbation to Fine-tuned Contrastive Learning Detector would be Better
Shengchao Liu
|
Xiaoming Liu
|
Yichen Wang
|
Zehua Cheng
|
Chengzhengxu Li
|
Zhaohan Zhang
|
Yu Lan
|
Chao Shen
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The burgeoning generative capabilities of large language models (LLMs) have raised growing concerns about abuse, demanding automatic machine-generated text detectors. DetectGPT, a zero-shot metric-based detector, first introduces perturbation and shows great performance improvement. However, in DetectGPT, the random perturbation strategy could introduce noise, and logit regression depends on the threshold, harming the generalizability and applicability of individual or small-batch inputs. Hence, we propose a novel fine-tuned detector, PECOLA, bridging metric-based and fine-tuned methods by contrastive learning on selective perturbation. Selective strategy retains important tokens during perturbation and weights for multi-pair contrastive learning. The experiments show that PECOLA outperforms the state-of-the-art (SOTA) by 1.20% in accuracy on average on four public datasets. And we further analyze the effectiveness, robustness, and generalization of the method.
pdf
bib
abs
Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks
Yichen Wang
|
Shangbin Feng
|
Abe Hou
|
Xiao Pu
|
Chao Shen
|
Xiaoming Liu
|
Yulia Tsvetkov
|
Tianxing He
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The widespread use of large language models (LLMs) is increasing the demand for methods that detect machine-generated text to prevent misuse. The goal of our study is to stress test the detectors’ robustness to malicious attacks under realistic scenarios. We comprehensively study the robustness of popular machine-generated text detectors under attacks from diverse categories: editing, paraphrasing, co-generating, and prompting. Our attacks assume limited access to the generator LLMs, and we compare the performance of detectors on different attacks under different budget levels. Our experiments reveal that almost none of the existing detectors remain robust under all the attacks, and all detectors exhibit different loopholes. Averaging all detectors, the performance drops by 35% across all attacks. Further, we investigate the reasons behind these defects and propose initial out-of-the-box patches.
pdf
bib
abs
Large Language Models Can Self-Correct with Key Condition Verification
Zhenyu Wu
|
Qingkai Zeng
|
Zhihan Zhang
|
Zhaoxuan Tan
|
Chao Shen
|
Meng Jiang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Intrinsic self-correct was a method that instructed large language models (LLMs) to verify and correct their responses without external feedback. Unfortunately, the study concluded that the LLMs could not self-correct reasoning yet. We find that a simple yet effective prompting method enhances LLM performance in identifying and correcting inaccurate answers without external feedback.That is to mask a key condition in the question, add the current response to construct a verification question, and predict the condition to verify the response. The condition can be an entity in an open-domain question or a numerical value in an arithmetic question, which requires minimal effort (via prompting) to identify. We propose an iterative verify-then-correct framework to progressively identify and correct (probably) false responses, named ProCo. We conduct experiments on three reasoning tasks. On average, ProCo, with GPT-3.5-Turbo-1106 as the backend LLM, yields +6.8 exact match on four open-domain question answering datasets, +14.1 accuracy on three arithmetic reasoning datasets, and +9.6 accuracy on a commonsense reasoning dataset, compared to Self-Correct.Our implementation is made publicly available at https://wzy6642.github.io/proco.github.io/.
pdf
bib
abs
StablePT : Towards Stable Prompting for Few-shot Learning via Input Separation
Xiaoming Liu
|
Chen Liu
|
Zhaohan Zhang
|
Chengzhengxu Li
|
Longtian Wang
|
Yu Lan
|
Chao Shen
Findings of the Association for Computational Linguistics: EMNLP 2024
Large language models have shown their ability to become effective few-shot learners with prompting, revoluting the paradigm of learning with data scarcity. However, this approach largely depends on the quality of prompt initialization and always exhibits large variability among different runs. Such property makes prompt tuning highly unreliable and vulnerable to poorly constructed prompts, which limits its extension to more real-world applications. To tackle this issue, we propose to treat the hard prompt and soft prompt as separate inputs to mitigate noise brought by the prompt initialization. Furthermore, we optimize soft prompts with contrastive learning for utilizing class-aware information in the training process to maintain model performance. Experimental results demonstrate that StablePT outperforms state-of-the-art methods by 6.97% in accuracy and reduces the standard deviation by 1.92 on average. Furthermore, extensive experiments underscore its robustness and stability across 8 datasets covering various tasks.
pdf
bib
abs
Instructing Large Language Models to Identify and Ignore Irrelevant Conditions
Zhenyu Wu
|
Chao Shen
|
Meng Jiang
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Math word problem (MWP) solving requires generating a reasoning path based on a given problem description that often contains irrelevant conditions.Existing chain-of-thought (CoT) prompting methods elicited multi-step reasoning abilities of large language models (LLMs) to solve MWPs.However, they were seriously confused by the irrelevant conditions, resulting in low accuracy.In this paper, we propose a novel approach named I3C that instructs LLMs to identify and ignore irrelevant conditions.It identifies a set of irrelevant condition candidates that have a weak semantic relevance with the question.Then it prompts LLMs to verify the irrelevant conditions.Lastly it instructs the LLMs with the verification on relevant and irrelevant conditions to avoid confusion and improve reasoning paths.Moreover, we propose to select (problem, reasoning paths) pairs as demonstrations to enhance I3C with few-shot reasoning. We develop I3C-Select that selects the most confusing problems based on the semantic relevance measurement.We conduct extensive experiments on eight MWP datasets.I3C can be combined with any CoT prompting methods to improve the performance of solving MWPs.Notably, with GPT-3.5-Turbo and I3C-Select, we achieve an accuracy of 96.0 and 94.1 on GSM-IC2-1K and GSM-ICM-1K, respectively, significantly outperforming the state-of-the-art few-shot prompting method Complex-CoT by +11.7 and +11.1.Our implementation is made publicly available at https://wzy6642.github.io/I3C.github.io/.
2023
pdf
bib
abs
CoCo: Coherence-Enhanced Machine-Generated Text Detection Under Low Resource With Contrastive Learning
Xiaoming Liu
|
Zhaohan Zhang
|
Yichen Wang
|
Hang Pu
|
Yu Lan
|
Chao Shen
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Machine-Generated Text (MGT) detection, a task that discriminates MGT from Human-Written Text (HWT), plays a crucial role in preventing misuse of text generative models, which excel in mimicking human writing style recently. Latest proposed detectors usually take coarse text sequences as input and fine-tune pretrained models with standard cross-entropy loss. However, these methods fail to consider the linguistic structure of texts. Moreover, they lack the ability to handle the low-resource problem which could often happen in practice considering the enormous amount of textual data online. In this paper, we present a coherence-based contrastive learning model named CoCo to detect the possible MGT under low-resource scenario. To exploit the linguistic feature, we encode coherence information in form of graph into text representation. To tackle the challenges of low data resource, we employ a contrastive learning framework and propose an improved contrastive loss for preventing performance degradation brought by simple samples. The experiment results on two public datasets and two self-constructed datasets prove our approach outperforms the state-of-art methods significantly. Also, we surprisingly find that MGTs originated from up-to-date language models could be easier to detect than these from previous models, in our experiments. And we propose some preliminary explanations for this counter-intuitive phenomena. All the codes and datasets are open-sourced.
2013
pdf
bib
A Participant-based Approach for Event Summarization Using Twitter Streams
Chao Shen
|
Fei Liu
|
Fuliang Weng
|
Tao Li
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
2011
pdf
bib
A Non-negative Matrix Factorization Based Approach for Active Dual Supervision from Document and Word Labels
Chao Shen
|
Tao Li
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing
2010
pdf
bib
Multi-Document Summarization via the Minimum Dominating Set
Chao Shen
|
Tao Li
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)