Word Usage Similarity Estimation with Sentence Representations and Automatic Substitutes

Aina Garí Soler, Marianna Apidianaki, Alexandre Allauzen


Abstract
Usage similarity estimation addresses the semantic proximity of word instances in different contexts. We apply contextualized (ELMo and BERT) word and sentence embeddings to this task, and propose supervised models that leverage these representations for prediction. Our models are further assisted by lexical substitute annotations automatically assigned to word instances by context2vec, a neural model that relies on a bidirectional LSTM. We perform an extensive comparison of existing word and sentence representations on benchmark datasets addressing both graded and binary similarity. The best-performing models outperform previous methods in both settings.
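To make the task concrete, here is a minimal illustrative sketch of two signals of the kind the abstract mentions: similarity between the contextual vectors of a word in two sentences, and agreement between the substitutes proposed for each instance. It is not the paper's model. The function names, the toy vectors, and the substitute lists are hypothetical, and the use of cosine similarity and Jaccard overlap is an assumption made for illustration. In practice the vectors would come from a contextual encoder such as ELMo or BERT, and the substitutes from a model such as context2vec.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def substitute_overlap(subs_a, subs_b):
    """Jaccard overlap between two collections of substitute words."""
    a, b = set(subs_a), set(subs_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical toy inputs for two instances of the same target word.
# Real vectors would be produced by a contextual encoder (assumption).
vec_instance_1 = [0.9, 0.1, 0.3]
vec_instance_2 = [0.8, 0.2, 0.4]
subs_instance_1 = ["bright", "shiny", "clear"]
subs_instance_2 = ["bright", "clever", "smart"]

embedding_signal = cosine_similarity(vec_instance_1, vec_instance_2)
substitute_signal = substitute_overlap(subs_instance_1, subs_instance_2)
print(embedding_signal, substitute_signal)
```

Signals like these could serve as input features to a supervised predictor of graded or binary usage similarity; how the paper actually combines its features is described in the PDF, not here.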
Anthology ID:
S19-1002
Volume:
Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Rada Mihalcea, Ekaterina Shutova, Lun-Wei Ku, Kilian Evang, Soujanya Poria
Venue:
*SEM
SIGs:
SIGLEX | SIGSEM
Publisher:
Association for Computational Linguistics
Pages:
9–21
URL:
https://aclanthology.org/S19-1002
DOI:
10.18653/v1/S19-1002
Cite (ACL):
Aina Garí Soler, Marianna Apidianaki, and Alexandre Allauzen. 2019. Word Usage Similarity Estimation with Sentence Representations and Automatic Substitutes. In Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019), pages 9–21, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
Word Usage Similarity Estimation with Sentence Representations and Automatic Substitutes (Garí Soler et al., *SEM 2019)
PDF:
https://aclanthology.org/S19-1002.pdf