Kai Mei


2024

pdf bib
Learning Autonomous Driving Tasks via Human Feedbacks with Large Language Models
Yunsheng Ma | Xu Cao | Wenqian Ye | Can Cui | Kai Mei | Ziran Wang
Findings of the Association for Computational Linguistics: EMNLP 2024

Traditional autonomous driving systems have mainly focused on making driving decisions without human interaction, overlooking human-like decision-making and human preference required in complex traffic scenarios. To bridge this gap, we introduce a novel framework leveraging Large Language Models (LLMs) for learning human-centered driving decisions from diverse simulation scenarios and environments that incorporate human feedback. Our contributions include a GPT-4-based programming planner that integrates seamlessly with the existing CARLA simulator to understand traffic scenes and react to human instructions. Specifically, we build a human-guided learning pipeline that incorporates human driver feedback directly into the learning process and stores optimal driving programming policy using Retrieval Augmented Generation (RAG). Impressively, our programming planner, with only 50 saved code snippets, can match the performance of baseline extensively trained reinforcement learning (RL) models. Our paper highlights the potential of an LLM-powered shared-autonomy system, pushing the frontier of autonomous driving system development to be more interactive and intuitive.

2023

pdf bib
NOTABLE: Transferable Backdoor Attacks Against Prompt-based NLP Models
Kai Mei | Zheng Li | Zhenting Wang | Yang Zhang | Shiqing Ma
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Prompt-based learning is vulnerable to backdoor attacks. Existing backdoor attacks against prompt-based models consider injecting backdoors into the entire embedding layers or word embedding vectors. Such attacks can be easily affected by retraining on downstream tasks and with different prompting strategies, limiting the transferability of backdoor attacks. In this work, we propose transferable backdoor attacks against prompt-based models, called NOTABLE, which is independent of downstream tasks and prompting strategies. Specifically, NOTABLE injects backdoors into the encoders of PLMs by utilizing an adaptive verbalizer to bind triggers to specific words (i.e., anchors). It activates the backdoor by pasting input with triggers to reach adversary-desired anchors, achieving independence from downstream tasks and prompting strategies. We conduct experiments on six NLP tasks, three popular models, and three prompting strategies. Empirical results show that NOTABLE achieves superior attack performance (i.e., attack success rate over 90% on all the datasets), and outperforms two state-of-the-art baselines. Evaluations on three defenses show the robustness of NOTABLE. Our code can be found at https://github.com/RU-System-Software-and-Security/Notable.