2024
pdf
bib
abs
Large Language Models Can Be Contextual Privacy Protection Learners
Yijia Xiao
|
Yiqiao Jin
|
Yushi Bai
|
Yue Wu
|
Xianjun Yang
|
Xiao Luo
|
Wenchao Yu
|
Xujiang Zhao
|
Yanchi Liu
|
Quanquan Gu
|
Haifeng Chen
|
Wei Wang
|
Wei Cheng
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
The proliferation of Large Language Models (LLMs) has driven considerable interest in fine-tuning them with domain-specific data to create specialized language models. Nevertheless, such domain-specific fine-tuning data often contains contextually sensitive personally identifiable information (PII). Direct fine-tuning LLMs on this data without privacy protection poses a risk of data leakage of sensitive PII during inference time. To address this challenge, we introduce Contextual Privacy Protection Language Models (CPPLM), a novel paradigm for fine-tuning LLMs that effectively injects domain-specific knowledge while safeguarding inference-time data privacy. Our work offers a theoretical analysis for model design and delves into various techniques such as corpus curation, penalty-based unlikelihood in training loss, and instruction-based tuning, etc. Extensive experiments across diverse datasets and scenarios demonstrate the effectiveness of our approaches. In particular, instruction tuning with both positive and negative examples, stands out as a promising method, effectively protecting private data while enhancing the model’s knowledge. Our work underscores the potential for Large Language Models as robust contextual privacy protection learners.
pdf
bib
abs
MultiAgent Collaboration Attack: Investigating Adversarial Attacks in Large Language Model Collaborations via Debate
Alfonso Amayuelas
|
Xianjun Yang
|
Antonis Antoniades
|
Wenyue Hua
|
Liangming Pan
|
William Yang Wang
Findings of the Association for Computational Linguistics: EMNLP 2024
Large Language Models (LLMs) have shown exceptional results on current benchmarks when working individually. The advancement in their capabilities, along with a reduction in parameter size and inference times, has facilitated the use of these models as agents, enabling interactions among multiple models to execute complex tasks. Such collaborations offer several advantages, including the use of specialized models (e.g. coding), improved confidence through multiple computations, and enhanced divergent thinking, leading to more diverse outputs. Thus, the collaborative use of language models is expected to grow significantly in the coming years. In this work, we evaluate the behavior of a network of models collaborating through debate under the influence of an adversary. We introduce pertinent metrics to assess the adversary’s effectiveness, focusing on system accuracy and model agreement. Our findings highlight the importance of a model’s persuasive ability in influencing others. Additionally, we explore inference-time methods to generate more compelling arguments and evaluate the potential of prompt-based mitigation as a defensive strategy.
pdf
bib
abs
A Survey on Detection of LLMs-Generated Content
Xianjun Yang
|
Liangming Pan
|
Xuandong Zhao
|
Haifeng Chen
|
Linda Ruth Petzold
|
William Yang Wang
|
Wei Cheng
Findings of the Association for Computational Linguistics: EMNLP 2024
The burgeoning capabilities of advanced large language models (LLMs) such as ChatGPT have led to an increase in synthetic content generation with implications across a variety of sectors, including media, cybersecurity, public discourse, and education. As such, the ability to detect LLMs-generated content has become of paramount importance. We aim to provide a detailed overview of existing detection strategies and benchmarks, scrutinizing their differences and identifying key challenges and prospects in the field, advocating for more adaptable and robust models to enhance detection accuracy. We also posit the necessity for a multi-faceted approach to defend against various attacks to counter the rapidly advancing capabilities of LLMs. To the best of our knowledge, this work is the first comprehensive survey on the detection in the era of LLMs. We hope it will provide a broad understanding of the current landscape of LLMs-generated content detection, and we have maintained a website to consistently update the latest research as a guiding reference for researchers and practitioners.
pdf
bib
abs
TrustAgent: Towards Safe and Trustworthy LLM-based Agents
Wenyue Hua
|
Xianjun Yang
|
Mingyu Jin
|
Zelong Li
|
Wei Cheng
|
Ruixiang Tang
|
Yongfeng Zhang
Findings of the Association for Computational Linguistics: EMNLP 2024
The rise of LLM-based agents shows great potential to revolutionize task planning, capturing significant attention. Given that these agents will be integrated into high-stake domains, ensuring their reliability and safety is crucial. This paper presents an Agent-Constitution-based agent framework, TrustAgent, with a particular focus on improving the LLM-based agent safety. The proposed framework ensures strict adherence to the Agent Constitution through three strategic components: pre-planning strategy which injects safety knowledge to the model before plan generation, in-planning strategy which enhances safety during plan generation, and post-planning strategy which ensures safety by post-planning inspection. Our experimental results demonstrate that the proposed framework can effectively enhance an LLM agent’s safety across multiple domains by identifying and mitigating potential dangers during the planning. Further analysis reveals that the framework not only improves safety but also enhances the helpfulness of the agent. Additionally, we highlight the importance of the LLM reasoning ability in adhering to the Constitution. This paper sheds light on how to ensure the safe integration of LLM-based agents into human-centric environments. Data and code are available at
https://anonymous.4open.science/r/TrustAgent-06DC.
pdf
bib
abs
Navigating the OverKill in Large Language Models
Chenyu Shi
|
Xiao Wang
|
Qiming Ge
|
Songyang Gao
|
Xianjun Yang
|
Tao Gui
|
Qi Zhang
|
Xuanjing Huang
|
Xun Zhao
|
Dahua Lin
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models are meticulously aligned to be both helpful and harmless. However, recent research points to a potential overkill which means models may refuse to answer benign queries. In this paper, we investigate the factors for overkill by exploring how models handle and determine the safety of queries. Our findings reveal the presence of shortcuts within models, leading to excessive attention to harmful words like ‘kill’ and prompts emphasizing safety will exacerbate overkill. Based on these insights, we introduce Self-Contrastive Decoding (Self-CD), a training-free and model-agnostic strategy, to alleviate this phenomenon. We first extract such excessive attention by amplifying the difference in the model’s output distributions when responding to system prompts that either include or omit an emphasis on safety. Then we determine the final next-token predictions by downplaying the excessive attention via contrastive decoding. Empirical results have indicated that our method has achieved an average reduction of the refusal rate by 20 % while having almost no impact on safety.
2023
pdf
bib
abs
Few-Shot Document-Level Event Argument Extraction
Xianjun Yang
|
Yujie Lu
|
Linda Petzold
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Event argument extraction (EAE) has been well studied at the sentence level but under-explored at the document level. In this paper, we study to capture event arguments that actually spread across sentences in documents. Prior works usually assume full access to rich document supervision, ignoring the fact that the available argument annotation is limited in production. To fill this gap, we present FewDocAE, a Few-Shot Document-Level Event Argument Extraction benchmark, based on the existing document-level event extraction dataset. We first define the new problem and reconstruct the corpus by a novel N-Way-D-Doc sampling instead of the traditional N-Way-K-Shot strategy. Then we adjust the current document-level neural models into the few-shot setting to provide baseline results under in- and cross-domain settings. Since the argument extraction depends on the context from multiple sentences and the learning process is limited to very few examples, we find this novel task to be very challenging with substantively low performance. Considering FewDocAE is closely related to practical use under low-resource regimes, we hope this benchmark encourages more research in this direction. Our data and codes will be available online.
pdf
bib
abs
OASum: Large-Scale Open Domain Aspect-based Summarization
Xianjun Yang
|
Kaiqiang Song
|
Sangwoo Cho
|
Xiaoyang Wang
|
Xiaoman Pan
|
Linda Petzold
|
Dong Yu
Findings of the Association for Computational Linguistics: ACL 2023
Aspect or query-based summarization has recently caught more attention, as it can generate differentiated summaries based on users’ interests. However, the current dataset for aspect or query-based summarization either focuses on specific domains, on a relatively small scale, or contains only a few aspect types. Such limitations hinder further explorations in this direction. In this work, we take advantage of crowd-sourcing knowledge on Wikipedia and automatically create a high-quality, large-scale open-domain aspect-based summarization dataset named OASum, which contains more than 3.7 million instances with around 1 million different aspects on 2 million Wikipedia pages. We provide benchmark results on OASum and demonstrate its ability for diverse aspect-based summarization generation. To overcome the data scarcity problem on specific domains, we also perform zero-shot, few-shot, and fine-tuning on seven downstream datasets. Specifically, zero/few-shot and fine-tuning results show that the model pre-trained on our corpus demonstrates a strong aspect or query-focused generation ability compared with the backbone model. Our dataset and pre-trained checkpoints are publicly available.
2022
pdf
bib
abs
PcMSP: A Dataset for Scientific Action Graphs Extraction from Polycrystalline Materials Synthesis Procedure Text
Xianjun Yang
|
Ya Zhuo
|
Julia Zuo
|
Xinlu Zhang
|
Stephen Wilson
|
Linda Petzold
Findings of the Association for Computational Linguistics: EMNLP 2022
Scientific action graphs extraction from materials synthesis procedures is important for reproducible research, machine automation, and material prediction. But the lack of annotated data has hindered progress in this field. We demonstrate an effort to annotate Polycrystalline Materials Synthesis Procedures PcMSP from 305 open access scientific articles for the construction of synthesis action graphs. This is a new dataset for material science information extraction that simultaneously contains the synthesis sentences extracted from the experimental paragraphs, as well as the entity mentions and intra-sentence relations. A two-step human annotation and inter-annotator agreement study guarantee the high quality of the PcMSP corpus. We introduce four natural language processing tasks: sentence classification, named entity recognition, relation classification, and joint extraction of entities and relations. Comprehensive experiments validate the effectiveness of several state-of-the-art models for these challenges while leaving large space for improvement. We also perform the error analysis and point out some unique challenges that require further investigation. We will release our annotation scheme, the corpus, and codes to the research community to alleviate the scarcity of labeled data in this domain.