Title: A Wolf in Sheep’s Clothing: Generalized Nested Jailbreak Prompts can Fool Large Language Models Easily
Authors: Peng Ding, Jun Kuang, Dan Ma, Xuezhi Cao, Yunsen Xian, Jiajun Chen, Shujian Huang
Date: 2024-06
Resource type: text
Published in: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Editors: Kevin Duh, Helena Gomez, Steven Bethard
Publisher: Association for Computational Linguistics
Location: Mexico City, Mexico
Genre: conference publication
Citation key: ding-etal-2024-wolf
DOI: 10.18653/v1/2024.naacl-long.118
URL: https://aclanthology.org/2024.naacl-long.118/
Pages: 2136-2153