InstructSafety: A Unified Framework for Building Multidimensional and Explainable Safety Detector through Instruction Tuning

Authors: Zhexin Zhang, Jiale Cheng, Hao Sun, Jiawen Deng, Minlie Huang
Published in: Findings of the Association for Computational Linguistics: EMNLP 2023
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Publisher: Association for Computational Linguistics
Location: Singapore
Date: 2023-12
Type: conference publication
Pages: 10421-10436
Anthology ID: zhang-etal-2023-instructsafety
DOI: 10.18653/v1/2023.findings-emnlp.700
URL: https://aclanthology.org/2023.findings-emnlp.700/