Word Sense Distance in Human Similarity Judgements and Contextualised Word Embeddings

Janosch Haber, Massimo Poesio


Abstract
Homonymy is often used to showcase one of the advantages of context-sensitive word embedding techniques such as ELMo and BERT. In this paper we want to shift the focus to the related but less exhaustively explored phenomenon of polysemy, where a word expresses various distinct but related senses in different contexts. Specifically, we aim to i) investigate a recent model of polyseme sense clustering proposed by Ortega-Andres & Vicente (2019) through analysing empirical evidence of word sense grouping in human similarity judgements, ii) extend the evaluation of context-sensitive word embedding systems by examining whether they encode differences in word sense similarity and iii) compare the word sense similarities of both methods to assess their correlation and gain some intuition as to how well contextualised word embeddings could be used as surrogate word sense similarity judgements in linguistic experiments.
Anthology ID:
2020.pam-1.17
Volume:
Proceedings of the Probability and Meaning Conference (PaM 2020)
Month:
June
Year:
2020
Address:
Gothenburg
Editors:
Christine Howes, Stergios Chatzikyriakidis, Adam Ek, Vidya Somashekarappa
Venue:
PaM
Publisher:
Association for Computational Linguistics
Pages:
128–145
URL:
https://aclanthology.org/2020.pam-1.17
Cite (ACL):
Janosch Haber and Massimo Poesio. 2020. Word Sense Distance in Human Similarity Judgements and Contextualised Word Embeddings. In Proceedings of the Probability and Meaning Conference (PaM 2020), pages 128–145, Gothenburg. Association for Computational Linguistics.
Cite (Informal):
Word Sense Distance in Human Similarity Judgements and Contextualised Word Embeddings (Haber & Poesio, PaM 2020)
PDF:
https://aclanthology.org/2020.pam-1.17.pdf