SafetyBench: Evaluating the Safety of Large Language Models

Authors: Zhexin Zhang, Leqi Lei, Lindong Wu, Rui Sun, Yongkang Huang, Chong Long, Xiao Liu, Xuanyu Lei, Jie Tang, Minlie Huang
Published in: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics
Location: Bangkok, Thailand
Date: 2024-08
Type: conference publication
Pages: 15537-15553
Anthology ID: zhang-etal-2024-safetybench
DOI: 10.18653/v1/2024.acl-long.830
URL: https://aclanthology.org/2024.acl-long.830/