A Multi-dimensional study on Bias in Vision-Language models

Gabriele Ruggeri, Debora Nozza


Abstract
In recent years, joint Vision-Language (VL) models have increased in popularity and capability. Very few studies have attempted to investigate bias in VL models, even though it is a well-known issue in both individual modalities. This paper presents the first multi-dimensional analysis of bias in English VL models, focusing on gender, ethnicity, and age as dimensions. When subjects are input as images, pre-trained VL models complete a neutral template with a hurtful word 5% of the time, with higher percentages for female and young subjects. We also test for bias in downstream models on the Visual Question Answering task, introducing a novel bias metric, the Vision-Language Association Test, based on questions designed to elicit biased associations between stereotypical concepts and targets. Our findings demonstrate that pre-trained VL models contain biases that are perpetuated in downstream tasks.
Anthology ID:
2023.findings-acl.403
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6445–6455
URL:
https://aclanthology.org/2023.findings-acl.403
DOI:
10.18653/v1/2023.findings-acl.403
Cite (ACL):
Gabriele Ruggeri and Debora Nozza. 2023. A Multi-dimensional study on Bias in Vision-Language models. In Findings of the Association for Computational Linguistics: ACL 2023, pages 6445–6455, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
A Multi-dimensional study on Bias in Vision-Language models (Ruggeri & Nozza, Findings 2023)
PDF:
https://aclanthology.org/2023.findings-acl.403.pdf
Video:
https://aclanthology.org/2023.findings-acl.403.mp4