Yongkang Huang
2024
CharacterGLM: Customizing Social Characters with Large Language Models
Jinfeng Zhou | Zhuang Chen | Dazhen Wan | Bosi Wen | Yi Song | Jifan Yu | Yongkang Huang | Pei Ke | Guanqun Bi | Libiao Peng | JiaMing Yang | Xiyao Xiao | Sahand Sabour | Xiaohan Zhang | Wenjing Hou | Yijia Zhang | Yuxiao Dong | Hongning Wang | Jie Tang | Minlie Huang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track
Character-based dialogue (CharacterDial) has become essential in the industry (e.g., Character.AI), enabling users to freely customize social characters for social interactions. However, the generalizability and adaptability across various conversational scenarios inherent in customizing social characters still lack public industrial solutions. To address these challenges, by dissecting well-rounded social characters composed of both inherent social profiles and external social behaviors, we manually collect a large-scale Chinese corpus featuring characters with diverse categories and behaviors, and develop CharacterGLM models alongside well-designed refinement methods. Extensive experiments show that CharacterGLM outperforms most popular open- and closed-source LLMs and performs comparably to GPT-4. We will release our data and models for local development and deployment.
SafetyBench: Evaluating the Safety of Large Language Models
Zhexin Zhang | Leqi Lei | Lindong Wu | Rui Sun | Yongkang Huang | Chong Long | Xiao Liu | Xuanyu Lei | Jie Tang | Minlie Huang
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
With the rapid development of Large Language Models (LLMs), increasing attention has been paid to their safety concerns. Consequently, evaluating the safety of LLMs has become an essential task for facilitating the broad applications of LLMs. Nevertheless, the absence of comprehensive safety evaluation benchmarks poses a significant impediment to effectively assessing and enhancing the safety of LLMs. In this work, we present SafetyBench, a comprehensive benchmark for evaluating the safety of LLMs, which comprises 11,435 diverse multiple-choice questions spanning 7 distinct categories of safety concerns. Notably, SafetyBench also incorporates both Chinese and English data, facilitating evaluation in both languages. Our extensive tests over 25 popular Chinese and English LLMs in both zero-shot and few-shot settings reveal a substantial performance advantage for GPT-4 over its counterparts, and there is still significant room for improving the safety of current LLMs. We also demonstrate that the measured safety understanding abilities in SafetyBench are correlated with safety generation abilities. Data and evaluation guidelines are available at https://github.com/thu-coai/SafetyBench. Submission entrance and leaderboard are available at https://llmbench.ai/safety.