Lost in Disambiguation: How Instruction-Tuned LLMs Master Lexical Ambiguity

Luca Capone, Serena Auriemma, Martina Miliani, Alessandro Bondielli, Alessandro Lenci


Abstract
This paper investigates how decoder-only instruction-tuned LLMs handle lexical ambiguity. Two distinct methodologies are employed: eliciting rating scores from the model via prompting and analysing the cosine similarity between the embeddings of pairs of polysemous words in context. Ratings and embeddings are obtained by providing pairs of sentences from Haber and Poesio (2021) to the model. These ratings and cosine similarity scores are compared with each other and with the human similarity judgments in the dataset. Surprisingly, the model scores show only a moderate correlation with the subjects’ similarity judgments and no correlation with the target word embedding similarities. The anisotropy of the vector space was also inspected as a potential source of the experimental results. The analysis reveals that the embedding spaces of two out of the three analyzed models exhibit high anisotropy, while the third model shows relatively moderate anisotropy compared to previous findings for models with similar architecture (Ethayarajh 2019). These findings offer new insights into the relationship between generation quality and vector representations in decoder-only LLMs.
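The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes that target-word embeddings have already been extracted as plain lists of floats, that the correlation is a Spearman rank correlation (the abstract does not name the coefficient), and that anisotropy is estimated as the mean pairwise cosine similarity between vectors. All function names are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation (assumed here): Pearson on the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_pairwise_cosine(vectors):
    """Rough anisotropy estimate: average cosine over all vector pairs.

    Values near 1 mean the vectors occupy a narrow cone (high anisotropy);
    values near 0 mean the directions are spread out.
    """
    pairs = list(combinations(vectors, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)
```

In use, one would compute `cosine` for the two target-word embeddings of each sentence pair, then pass the resulting list together with the human judgments (or the prompted model ratings) to `spearman`.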
Anthology ID:
2024.clicit-1.19
Volume:
Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024)
Month:
December
Year:
2024
Address:
Pisa, Italy
Editors:
Felice Dell'Orletta, Alessandro Lenci, Simonetta Montemagni, Rachele Sprugnoli
Venue:
CLiC-it
Publisher:
CEUR Workshop Proceedings
Pages:
148–156
URL:
https://aclanthology.org/2024.clicit-1.19/
Cite (ACL):
Luca Capone, Serena Auriemma, Martina Miliani, Alessandro Bondielli, and Alessandro Lenci. 2024. Lost in Disambiguation: How Instruction-Tuned LLMs Master Lexical Ambiguity. In Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024), pages 148–156, Pisa, Italy. CEUR Workshop Proceedings.
Cite (Informal):
Lost in Disambiguation: How Instruction-Tuned LLMs Master Lexical Ambiguity (Capone et al., CLiC-it 2024)
PDF:
https://aclanthology.org/2024.clicit-1.19.pdf