Finding BERT’s Idiomatic Key

Vasudevan Nedumpozhimana, John Kelleher


Abstract
Sentence embeddings encode information relating to the usage of idioms in a sentence. This paper reports a set of experiments that combine a probing methodology with input masking to analyse where in a sentence this idiomatic information is taken from, and what form it takes. Our results indicate that BERT’s idiomatic key is primarily found within an idiomatic expression, but also draws on information from the surrounding context. Also, BERT can distinguish between the disruption in a sentence caused by words missing and the incongruity caused by idiomatic usage.
Anthology ID:
2021.mwe-1.7
Volume:
Proceedings of the 17th Workshop on Multiword Expressions (MWE 2021)
Month:
August
Year:
2021
Address:
Online
Editors:
Paul Cook, Jelena Mitrović, Carla Parra Escartín, Ashwini Vaidya, Petya Osenova, Shiva Taslimipoor, Carlos Ramisch
Venue:
MWE
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
57–62
URL:
https://aclanthology.org/2021.mwe-1.7
DOI:
10.18653/v1/2021.mwe-1.7
Cite (ACL):
Vasudevan Nedumpozhimana and John Kelleher. 2021. Finding BERT’s Idiomatic Key. In Proceedings of the 17th Workshop on Multiword Expressions (MWE 2021), pages 57–62, Online. Association for Computational Linguistics.
Cite (Informal):
Finding BERT’s Idiomatic Key (Nedumpozhimana & Kelleher, MWE 2021)
PDF:
https://aclanthology.org/2021.mwe-1.7.pdf