@inproceedings{nagar-etal-2025-transformer,
title = "How do Transformer Embeddings Represent Compositions? A Functional Analysis",
author = "Nagar, Aishik and
Rawal, Ishaan Singh and
Dhanania, Mansi and
Tan, Cheston",
editor = "Che, Wanxiang and
Nabende, Joyce and
Shutova, Ekaterina and
Pilehvar, Mohammad Taher",
booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.findings-acl.1104/",
doi = "10.18653/v1/2025.findings-acl.1104",
pages = "21444--21461",
ISBN = "979-8-89176-256-5",
abstract = "Compositionality is a key aspect of human intelligence, essential for reasoning and generalization. While transformer-based models have become the de facto standard for many language modeling tasks, little is known about how they represent compound words, and whether these representations are compositional. In this study, we test compositionality in Mistral, OpenAI Large, and Google embedding models, and compare them with BERT. First, we evaluate compositionality in the representations by examining six diverse models of compositionality (addition, multiplication, dilation, regression, etc.). We find that ridge regression, albeit linear, best accounts for compositionality. Surprisingly, we find that the classic vector addition model performs almost as well as any other model. Next, we verify that most embedding models are highly compositional, while BERT shows much poorer compositionality. We verify and visualize our findings with a synthetic dataset consisting of fully transparent adjective-noun compositions. Overall, we present a thorough investigation of compositionality."
}<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="nagar-etal-2025-transformer">
<titleInfo>
<title>How do Transformer Embeddings Represent Compositions? A Functional Analysis</title>
</titleInfo>
<name type="personal">
<namePart type="given">Aishik</namePart>
<namePart type="family">Nagar</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ishaan</namePart>
<namePart type="given">Singh</namePart>
<namePart type="family">Rawal</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Mansi</namePart>
<namePart type="family">Dhanania</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Cheston</namePart>
<namePart type="family">Tan</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-07</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Findings of the Association for Computational Linguistics: ACL 2025</title>
</titleInfo>
<name type="personal">
<namePart type="given">Wanxiang</namePart>
<namePart type="family">Che</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Joyce</namePart>
<namePart type="family">Nabende</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ekaterina</namePart>
<namePart type="family">Shutova</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Mohammad</namePart>
<namePart type="given">Taher</namePart>
<namePart type="family">Pilehvar</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Vienna, Austria</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-256-5</identifier>
</relatedItem>
<abstract>Compositionality is a key aspect of human intelligence, essential for reasoning and generalization. While transformer-based models have become the de facto standard for many language modeling tasks, little is known about how they represent compound words, and whether these representations are compositional. In this study, we test compositionality in Mistral, OpenAI Large, and Google embedding models, and compare them with BERT. First, we evaluate compositionality in the representations by examining six diverse models of compositionality (addition, multiplication, dilation, regression, etc.). We find that ridge regression, albeit linear, best accounts for compositionality. Surprisingly, we find that the classic vector addition model performs almost as well as any other model. Next, we verify that most embedding models are highly compositional, while BERT shows much poorer compositionality. We verify and visualize our findings with a synthetic dataset consisting of fully transparent adjective-noun compositions. Overall, we present a thorough investigation of compositionality.</abstract>
<identifier type="citekey">nagar-etal-2025-transformer</identifier>
<identifier type="doi">10.18653/v1/2025.findings-acl.1104</identifier>
<location>
<url>https://aclanthology.org/2025.findings-acl.1104/</url>
</location>
<part>
<date>2025-07</date>
<extent unit="page">
<start>21444</start>
<end>21461</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T How do Transformer Embeddings Represent Compositions? A Functional Analysis
%A Nagar, Aishik
%A Rawal, Ishaan Singh
%A Dhanania, Mansi
%A Tan, Cheston
%Y Che, Wanxiang
%Y Nabende, Joyce
%Y Shutova, Ekaterina
%Y Pilehvar, Mohammad Taher
%S Findings of the Association for Computational Linguistics: ACL 2025
%D 2025
%8 July
%I Association for Computational Linguistics
%C Vienna, Austria
%@ 979-8-89176-256-5
%F nagar-etal-2025-transformer
%X Compositionality is a key aspect of human intelligence, essential for reasoning and generalization. While transformer-based models have become the de facto standard for many language modeling tasks, little is known about how they represent compound words, and whether these representations are compositional. In this study, we test compositionality in Mistral, OpenAI Large, and Google embedding models, and compare them with BERT. First, we evaluate compositionality in the representations by examining six diverse models of compositionality (addition, multiplication, dilation, regression, etc.). We find that ridge regression, albeit linear, best accounts for compositionality. Surprisingly, we find that the classic vector addition model performs almost as well as any other model. Next, we verify that most embedding models are highly compositional, while BERT shows much poorer compositionality. We verify and visualize our findings with a synthetic dataset consisting of fully transparent adjective-noun compositions. Overall, we present a thorough investigation of compositionality.
%R 10.18653/v1/2025.findings-acl.1104
%U https://aclanthology.org/2025.findings-acl.1104/
%U https://doi.org/10.18653/v1/2025.findings-acl.1104
%P 21444-21461

Markdown (Informal)
[How do Transformer Embeddings Represent Compositions? A Functional Analysis](https://aclanthology.org/2025.findings-acl.1104/) (Nagar et al., Findings 2025)

ACL
Aishik Nagar, Ishaan Singh Rawal, Mansi Dhanania, and Cheston Tan. 2025. How do Transformer Embeddings Represent Compositions? A Functional Analysis. In Findings of the Association for Computational Linguistics: ACL 2025, pages 21444–21461, Vienna, Austria. Association for Computational Linguistics.
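The abstract compares several composition functions, including vector addition and ridge regression. As a rough illustration of how two such functions can be fit and scored against each other, here is a minimal, hypothetical Python sketch on synthetic embeddings; the dimensions, noise model, regularization strength, train/test split, and cosine-similarity metric are all assumptions for illustration, not the paper's actual protocol.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, d = 500, 64  # number of adjective-noun pairs and embedding dim (assumed)

# Stand-ins for constituent and compound embeddings (e.g. "red", "car", "red car").
adj = rng.normal(size=(n, d))
noun = rng.normal(size=(n, d))
# Toy ground truth: compounds are a noisy linear mix of their constituents.
compound = 0.6 * adj + 0.4 * noun + 0.05 * rng.normal(size=(n, d))

# Concatenated constituent vectors serve as the regression input.
X = np.hstack([adj, noun])
(X_tr, X_te, y_tr, y_te,
 adj_tr, adj_te, noun_tr, noun_te) = train_test_split(
    X, compound, adj, noun, test_size=0.2, random_state=0)

def mean_cos(a, b):
    """Mean row-wise cosine similarity between two matrices."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return float((a * b).sum(axis=1).mean())

# Composition model 1: classic vector addition (no parameters to fit).
pred_add = adj_te + noun_te

# Composition model 2: ridge regression mapping constituents to the compound.
ridge = Ridge(alpha=1.0).fit(X_tr, y_tr)  # alpha=1.0 is an assumed value
pred_ridge = ridge.predict(X_te)

print(f"vector addition  mean cos: {mean_cos(pred_add, y_te):.3f}")
print(f"ridge regression mean cos: {mean_cos(pred_ridge, y_te):.3f}")
```

In the paper's setting, the constituent and compound vectors would instead come from the embedding models named in the abstract (Mistral, OpenAI Large, Google, BERT) for real adjective-noun compounds, with the remaining composition functions (multiplication, dilation, etc.) evaluated the same way.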