Still a Pain in the Neck: Evaluating Text Representations on Lexical Composition

Vered Shwartz, Ido Dagan


Abstract
Building meaningful phrase representations is challenging because phrase meanings are not simply the sum of their constituent meanings. Lexical composition can shift the meanings of the constituent words and introduce implicit information. We tested a broad range of textual representations for their capacity to address these issues. We found that, as expected, contextualized word representations perform better than static word embeddings, more so on detecting meaning shift than on recovering implicit information, where their performance is still far from that of humans. Our evaluation suite, consisting of six tasks related to lexical composition effects, can serve future research aiming to improve representations.
Anthology ID:
Q19-1027
Volume:
Transactions of the Association for Computational Linguistics, Volume 7
Year:
2019
Address:
Cambridge, MA
Editors:
Lillian Lee, Mark Johnson, Brian Roark, Ani Nenkova
Venue:
TACL
Publisher:
MIT Press
Pages:
403–419
URL:
https://aclanthology.org/Q19-1027
DOI:
10.1162/tacl_a_00277
Cite (ACL):
Vered Shwartz and Ido Dagan. 2019. Still a Pain in the Neck: Evaluating Text Representations on Lexical Composition. Transactions of the Association for Computational Linguistics, 7:403–419.
Cite (Informal):
Still a Pain in the Neck: Evaluating Text Representations on Lexical Composition (Shwartz & Dagan, TACL 2019)
PDF:
https://aclanthology.org/Q19-1027.pdf
Code:
vered1986/lexcomp
Data
STREUSLE