Unveiling the Implicit Toxicity in Large Language Models

Authors: Jiaxin Wen, Pei Ke, Hao Sun, Zhexin Zhang, Chengfei Li, Jinfeng Bai, Minlie Huang
Published in: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Publisher: Association for Computational Linguistics, Singapore
Date: 2023-12
Pages: 1322-1338
Type: conference publication
Citation key: wen-etal-2023-unveiling
DOI: 10.18653/v1/2023.emnlp-main.84
URL: https://aclanthology.org/2023.emnlp-main.84/