Z.y. Peng
2024
RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models
Noah Wang
|
Z.y. Peng
|
Haoran Que
|
Jiaheng Liu
|
Wangchunshu Zhou
|
Yuhan Wu
|
Hongcheng Guo
|
Ruitong Gan
|
Zehao Ni
|
Jian Yang
|
Man Zhang
|
Zhaoxiang Zhang
|
Wanli Ouyang
|
Ke Xu
|
Wenhao Huang
|
Jie Fu
|
Junran Peng
Findings of the Association for Computational Linguistics: ACL 2024
The advent of Large Language Models (LLMs) has paved the way for complex tasks such as role-playing, which enhances user interactions by enabling models to imitate various characters. However, the closed-source nature of state-of-the-art LLMs and their general-purpose training limit role-playing optimization. In this paper, we introduce RoleLLM, a framework to benchmark, elicit, and enhance role-playing abilities in LLMs. RoleLLM comprises four stages: (1) Role Profile Construction for 100 roles; (2) Context-Based Instruction Generation (Context-Instruct) for role-specific knowledge extraction; (3) Role Prompting using GPT (RoleGPT) for speaking style imitation; and (4) Role-Conditioned Instruction Tuning (RoCIT) for fine-tuning open-source models along with role customization. By Context-Instruct and RoleGPT, we create RoleBench, the first systematic and fine-grained character-level benchmark dataset for role-playing with 168,093 samples. Moreover, RoCIT on RoleBench yields RoleLLaMA (English) and RoleGLM (Chinese), significantly enhancing role-playing abilities and even achieving comparable results with RoleGPT (using GPT-4).
Search
Co-authors
- Noah Wang 1
- Haoran Que 1
- Jiaheng Liu 1
- Wangchunshu Zhou 1
- Yuhan Wu 1
- show all...