How Are Idioms Processed Inside Transformer Language Models?

Ye Tian, Isobel James, Hye Son


Abstract
Idioms such as “call it a day” and “piece of cake” are prevalent in natural language. How do Transformer language models process idioms? This study examines this question by analysing three models: BERT, Multilingual BERT, and DistilBERT. We compare the embeddings of idiomatic and literal expressions across all layers of the networks, at both the sentence and word levels. Additionally, we investigate the attention that other sentence tokens direct towards a word when it appears within an idiom as opposed to a literal context. Results indicate that while the three models exhibit slightly different internal mechanisms, they all represent idioms distinctively compared to literal language, with attention playing a critical role. These findings suggest that idioms are semantically and syntactically idiosyncratic, not only for humans but also for language models.
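
The kind of per-layer embedding and attention analysis described in the abstract can be sketched with Hugging Face transformers. The code below is not the authors' released code; the model choice (bert-base-uncased), the example sentence pair, the mean-pooling of token embeddings, and the cosine-similarity comparison are all illustrative assumptions.

# Minimal sketch, not the authors' code: extract per-layer hidden states and
# attention weights from BERT via Hugging Face transformers. Model choice,
# sentence pair, pooling, and similarity metric are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained(
    "bert-base-uncased", output_hidden_states=True, output_attentions=True
)
model.eval()

# Same verb, idiomatic vs. literal use
idiomatic = "After ten hours of work, we decided to call it a day."
literal = "After ten hours of work, we decided to call the office."

def encode(sentence):
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    # hidden_states: (num_layers + 1) tensors of shape [1, seq_len, hidden]
    # attentions:    num_layers tensors of shape [1, heads, seq_len, seq_len]
    return inputs, out.hidden_states, out.attentions

idiom_in, idiom_h, idiom_a = encode(idiomatic)
lit_in, lit_h, lit_a = encode(literal)

# Sentence-level comparison per layer: mean-pool tokens, then cosine similarity
for layer, (h_i, h_l) in enumerate(zip(idiom_h, lit_h)):
    sim = torch.nn.functional.cosine_similarity(
        h_i.mean(dim=1), h_l.mean(dim=1)
    ).item()
    print(f"layer {layer:2d}: sentence-level cosine similarity = {sim:.3f}")

# Attention that other tokens direct at the target word "call", averaged
# over layers, heads, and query positions
def attention_to(word, inputs, attentions):
    pos = inputs["input_ids"][0].tolist().index(
        tokenizer.convert_tokens_to_ids(word)
    )
    att = torch.stack(attentions).mean(dim=(0, 2))  # [1, seq_len, seq_len]
    return att[0, :, pos].mean().item()

print("attention to 'call' (idiom):  ", attention_to("call", idiom_in, idiom_a))
print("attention to 'call' (literal):", attention_to("call", lit_in, lit_a))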
Anthology ID:
2023.starsem-1.16
Volume:
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Alexis Palmer, Jose Camacho-Collados
Venue:
*SEM
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
174–179
URL:
https://aclanthology.org/2023.starsem-1.16
DOI:
10.18653/v1/2023.starsem-1.16
Cite (ACL):
Ye Tian, Isobel James, and Hye Son. 2023. How Are Idioms Processed Inside Transformer Language Models? In Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023), pages 174–179, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
How Are Idioms Processed Inside Transformer Language Models? (Tian et al., *SEM 2023)
PDF:
https://aclanthology.org/2023.starsem-1.16.pdf