A Psycholinguistic Analysis of BERT’s Representations of Compounds

Lars Buijtelaar, Sandro Pezzelle


Abstract
This work studies the semantic representations learned by BERT for compounds, that is, expressions such as sunlight or bodyguard. We build on recent studies that explore semantic information in Transformers at the word level and test whether BERT aligns with human semantic intuitions when dealing with expressions (e.g., sunlight) whose overall meaning depends, to varying degrees, on the semantics of the constituent words (sun, light). We leverage a dataset that includes human judgments on two psycholinguistic measures of compound semantic analysis: lexeme meaning dominance (LMD; quantifying the weight of each constituent toward the compound meaning) and semantic transparency (ST; evaluating the extent to which the compound meaning is recoverable from the constituents’ semantics). We show that BERT-based measures moderately align with human intuitions, especially when using contextualized representations, and that LMD is overall more predictable than ST. Contrary to the results reported for ‘standard’ words, higher, more contextualized layers are the best at representing compound meaning. These findings shed new light on the abilities of BERT in dealing with fine-grained semantic phenomena. Moreover, they can provide insights into how speakers represent compounds.
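To make the two measures concrete, the following is a minimal sketch of how LMD-like and ST-like scores could be derived from vector representations. It is an illustration only and is not taken from the paper: the function names, the choice of cosine similarity, the difference and mean formulas, and the toy 3-dimensional vectors are all assumptions. In practice the vectors would come from a model such as BERT.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def compound_scores(compound, first, second):
    """Hypothetical proxies for the two measures (assumed formulas).

    lmd_proxy: positive when the compound is closer to the second
               constituent, negative when closer to the first.
    st_proxy:  mean similarity of the compound to its constituents.
    """
    sim_first = cosine(compound, first)
    sim_second = cosine(compound, second)
    return {
        "lmd_proxy": sim_second - sim_first,
        "st_proxy": (sim_first + sim_second) / 2,
    }


# Toy vectors invented for illustration; they are not model outputs.
sunlight = [1.0, 1.0, 0.0]
sun = [1.0, 0.0, 0.0]
light = [0.0, 1.0, 0.0]

scores = compound_scores(sunlight, sun, light)
print(scores)
```

Under these assumptions, a compound vector that sits equally close to both constituents yields an LMD proxy of zero. Scores of this kind could then be compared against human ratings with a rank correlation.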
Anthology ID:
2023.eacl-main.163
Volume:
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
Month:
May
Year:
2023
Address:
Dubrovnik, Croatia
Editors:
Andreas Vlachos, Isabelle Augenstein
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
2230–2241
URL:
https://aclanthology.org/2023.eacl-main.163
DOI:
10.18653/v1/2023.eacl-main.163
Cite (ACL):
Lars Buijtelaar and Sandro Pezzelle. 2023. A Psycholinguistic Analysis of BERT’s Representations of Compounds. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 2230–2241, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal):
A Psycholinguistic Analysis of BERT’s Representations of Compounds (Buijtelaar & Pezzelle, EACL 2023)
PDF:
https://aclanthology.org/2023.eacl-main.163.pdf
Video:
https://aclanthology.org/2023.eacl-main.163.mp4