Victoria Muñoz-Garcia
2025
Revealing Gender Bias in Language Models through Fashion Image Captioning
Maria Villalba-Oses | Victoria Muñoz-Garcia | Juan Pablo Consuegra-Ayala
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Image captioning bridges computer vision and natural language processing but remains vulnerable to social biases. This study evaluates gender bias in ChatGPT, Copilot, and Grok by analyzing their descriptions of fashion-related images prompted without gender cues. We introduce a methodology combining gender annotation, stereotype classification, and a manually curated dataset. Results show that GPT-4o and Grok frequently assign gender and reinforce stereotypes, while Copilot more often generates neutral captions. Grok shows the lowest error rate but consistently assigns gender, even when cues are ambiguous. These findings highlight the need for bias-aware captioning approaches in multimodal systems.