ToViLaG: Your Visual-Language Generative Model is Also An Evildoer

Xinpeng Wang; Xiaoyuan Yi; Han Jiang; Shanlin Zhou; Zhihua Wei; Xing Xie

doi:10.18653/v1/2023.emnlp-main.213

Abstract

Recent large-scale Visual-Language Generative Models (VLGMs) have achieved unprecedented improvement in multimodal image/text generation. However, these models might also generate toxic content, e.g., offensive text and pornographic images, raising significant ethical risks. Despite exhaustive studies on toxic degeneration of language models, this problem remains largely unexplored within the context of visual-language generation. This work delves into the propensity for toxicity generation and susceptibility to toxic data across various VLGMs. For this purpose, we built ToViLaG, a dataset comprising 32K co-toxic/mono-toxic text-image pairs and 1K innocuous but evocative text that tends to stimulate toxicity. Furthermore, we propose WInToRe, a novel toxicity metric tailored to visual-language generation, which theoretically reflects different aspects of toxicity considering both input and output. On such a basis, we benchmarked the toxicity of a diverse spectrum of VLGMs and discovered that some models do more evil than expected while some are more vulnerable to infection, underscoring the necessity of VLGM detoxification. Therefore, we develop an innovative bottleneck-based detoxification method. Our method could reduce toxicity while maintaining comparable generation quality, providing a promising initial solution to this line of research.

Anthology ID: 2023.emnlp-main.213
Volume: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 3508–3533
URL: https://aclanthology.org/2023.emnlp-main.213/
DOI: 10.18653/v1/2023.emnlp-main.213
Cite (ACL): Xinpeng Wang, Xiaoyuan Yi, Han Jiang, Shanlin Zhou, Zhihua Wei, and Xing Xie. 2023. ToViLaG: Your Visual-Language Generative Model is Also An Evildoer. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 3508–3533, Singapore. Association for Computational Linguistics.
Cite (Informal): ToViLaG: Your Visual-Language Generative Model is Also An Evildoer (Wang et al., EMNLP 2023)
PDF: https://aclanthology.org/2023.emnlp-main.213.pdf
Video: https://aclanthology.org/2023.emnlp-main.213.mp4
