On the Vulnerability of Safety Alignment in Open-Access LLMs

Authors: Jingwei Yi, Rui Ye, Qisi Chen, Bin Zhu, Siheng Chen, Defu Lian, Guangzhong Sun, Xing Xie, Fangzhao Wu
Published in: Findings of the Association for Computational Linguistics: ACL 2024
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics
Location: Bangkok, Thailand
Date: 2024-08
Type: conference publication
Pages: 9236-9260
Citation key: yi-etal-2024-vulnerability
DOI: 10.18653/v1/2024.findings-acl.549
URL: https://aclanthology.org/2024.findings-acl.549/