Title: Defending LLMs against Jailbreaking Attacks via Backtranslation
Authors: Yihan Wang, Zhouxing Shi, Andrew Bai, Cho-Jui Hsieh
Date: 2024-08
Resource type: text
Genre: conference publication
Published in: Findings of the Association for Computational Linguistics: ACL 2024
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics
Location: Bangkok, Thailand
Citation key: wang-etal-2024-defending
DOI: 10.18653/v1/2024.findings-acl.948
URL: https://aclanthology.org/2024.findings-acl.948/
Pages: 16031-16046