Ryan Sun
2024
SpecHub: Provable Acceleration to Multi-Draft Speculative Decoding
Ryan Sun
|
Tianyi Zhou
|
Xun Chen
|
Lichao Sun
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Large Language Models (LLMs) have become essential in advancing natural language processing (NLP) tasks, but their sequential token generation limits inference speed. Multi-Draft Speculative Decoding (MDSD) offers a promising solution by using a smaller draft model to generate multiple token sequences, which the target LLM verifies in parallel.However, current heuristic approaches, such as Recursive Rejection Sampling (RRS), suffer from low acceptance rates in subsequent drafts, limiting the advantages of using multiple drafts. Meanwhile, Optimal Transport with Membership Cost (OTM) can theoretically improve acceptance rates, but its computational cost is too high for real-time use.We present SpecHub, a novel, efficient sampling-verification method for MDSD that improves acceptance rates with only linear computational overhead. By simplifying the OTM problem into a compact Linear Programming model, SpecHub significantly reduces computational complexity. It further accelerates sampling by leveraging a sparse joint distribution, focusing computation on high-probability token sequences.%It integrates seamlessly into existing MDSD frameworks.In extensive experiments, Spechub consistently generates 0.05-0.27 and 0.02-0.16 more tokens per step than RRS and RRS without replacement. We attach our code at https://github.com/MasterGodzilla/Speculative_decoding_OT.
Virtual Context Enhancing Jailbreak Attacks with Special Token Injection
Yuqi Zhou
|
Lin Lu
|
Ryan Sun
|
Pan Zhou
|
Lichao Sun
Findings of the Association for Computational Linguistics: EMNLP 2024
Jailbreak attacks on large language models (LLMs) involve inducing these models to generate harmful content that violates ethics or laws, posing a significant threat to LLM security. Current jailbreak attacks face two main challenges: low success rates due to defensive measures and high resource requirements for crafting specific prompts. This paper introduces Virtual Context, which leverages special tokens, previously overlooked in LLM security, to improve jailbreak attacks. Virtual Context addresses these challenges by significantly increasing the success rates of existing jailbreak methods and requiring minimal background knowledge about the target model, thus enhancing effectiveness in black-box settings without additional overhead. Comprehensive evaluations show that Virtual Context-assisted jailbreak attacks can improve the success rates of four widely used jailbreak methods by approximately 40% across various LLMs. Additionally, applying Virtual Context to original malicious behaviors still achieves a notable jailbreak effect. In summary, our research highlights the potential of special tokens in jailbreak attacks and recommends including this threat in red-teaming testing to comprehensively enhance LLM security.
Search
Co-authors
- Lichao Sun 2
- Tianyi Zhou 1
- Xun Chen 1
- Yuqi Zhou 1
- Lin Lu 1
- show all...
- Pan Zhou 1