Jerone Andrews
2024
Resampled Datasets Are Not Enough: Mitigating Societal Bias Beyond Single Attributes
Yusuke Hirota
|
Jerone Andrews
|
Dora Zhao
|
Orestis Papakyriakopoulos
|
Apostolos Modas
|
Yuta Nakashima
|
Alice Xiang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
We tackle societal bias in image-text datasets by removing spurious correlations between protected groups and image attributes. Traditional methods only target labeled attributes, ignoring biases from unlabeled ones. Using text-guided inpainting models, our approach ensures protected group independence from all attributes and mitigates inpainting biases through data filtering. Evaluations on multi-label image classification and image captioning tasks show our method effectively reduces bias without compromising performance across various models. Specifically, we achieve an average societal bias reduction of 46.1% in leakage-based bias metrics for multi-label classification and 74.8% for image captioning.
Images Speak Louder than Words: Understanding and Mitigating Bias in Vision-Language Model from a Causal Mediation Perspective
Zhaotian Weng
|
Zijun Gao
|
Jerone Andrews
|
Jieyu Zhao
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Vision-language models (VLMs) pre-trained on extensive datasets can inadvertently learn biases by correlating gender information with specific objects or scenarios. Current methods, which focus on modifying inputs and monitoring changes in the model’s output probability scores, often struggle to comprehensively understand bias from the perspective of model components. We propose a framework that incorporates causal mediation analysis to measure and map the pathways of bias generation and propagation within VLMs. Our framework is applicable to a wide range of vision-language and multimodal tasks. In this work, we apply it to the object detection task and implement it on the GLIP model. This approach allows us to identify the direct effects of interventions on model bias and the indirect effects of interventions on bias mediated through different model components. Our results show that image features are the primary contributors to bias, with significantly higher impacts than text features, specifically accounting for 32.57% and 12.63% of the bias in the MSCOCO and PASCAL-SENTENCE datasets, respectively. Notably, the image encoder’s contribution surpasses that of the text encoder and the deep fusion encoder. Further experimentation confirms that contributions from both language and vision modalities are aligned and non-conflicting. Consequently, focusing on blurring gender representations within the image encoder which contributes most to the model bias, reduces bias efficiently by 22.03% and 9.04% in the MSCOCO and PASCAL-SENTENCE datasets, respectively, with minimal performance loss or increased computational demands.
Search
Co-authors
- Yusuke Hirota 1
- Dora Zhao 1
- Orestis Papakyriakopoulos 1
- Apostolos Modas 1
- Yuta Nakashima 1
- show all...