Rainbow - A Benchmark for Systematic Testing of How Sensitive Visio-Linguistic Models are to Color Naming

Marie Bexte; Andrea Horbach; Torsten Zesch

doi:10.18653/v1/2024.eacl-long.112

Abstract

With the recent emergence of powerful visio-linguistic models comes the question of how fine-grained their multi-modal understanding is. This has led to the release of several probing datasets. Results point towards models having trouble with prepositions and verbs, but being relatively robust when it comes to color. To gauge how deep this understanding goes, we compile a comprehensive probing dataset to systematically test multi-modal alignment around color. We demonstrate how human perception influences descriptions of color and pay special attention to the extent to which this is reflected within the predictions of a visio-linguistic model. Probing a set of models with diverse properties with our benchmark confirms the superiority of models that do not rely on pre-extracted image features, and demonstrates that augmentation with too much noisy pre-training data can produce an inferior model. While the benchmark remains challenging for all models we test, the overall result pattern suggests well-founded alignment of color terms with hues. Analyses do, however, reveal uncertainty regarding the boundaries between neighboring color terms.

Anthology ID: 2024.eacl-long.112
Volume: Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: March
Year: 2024
Address: St. Julian’s, Malta
Editors: Yvette Graham, Matthew Purver
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 1858–1875
URL: https://aclanthology.org/2024.eacl-long.112/
DOI: 10.18653/v1/2024.eacl-long.112
Cite (ACL): Marie Bexte, Andrea Horbach, and Torsten Zesch. 2024. Rainbow - A Benchmark for Systematic Testing of How Sensitive Visio-Linguistic Models are to Color Naming. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1858–1875, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal): Rainbow - A Benchmark for Systematic Testing of How Sensitive Visio-Linguistic Models are to Color Naming (Bexte et al., EACL 2024)
PDF: https://aclanthology.org/2024.eacl-long.112.pdf
Video: https://aclanthology.org/2024.eacl-long.112.mp4
