LSTMEmbed: Learning Word and Sense Representations from a Large Semantically Annotated Corpus with Long Short-Term Memories

Ignacio Iacobacci, Roberto Navigli


Abstract
While word embeddings are now a de facto standard representation of words in most NLP tasks, attention has recently been shifting towards vector representations that capture the different meanings, i.e., senses, of words. In this paper we explore the capabilities of a bidirectional LSTM model to learn representations of word senses from semantically annotated corpora. We show that using an architecture that is aware of word order, like an LSTM, enables us to create better representations. We assess our proposed model on various standard benchmarks for evaluating semantic representations, reaching state-of-the-art performance on the SemEval-2014 word-to-sense similarity task. We release the code and the resulting word and sense embeddings at http://lcl.uniroma1.it/LSTMEmbed.
Anthology ID:
P19-1165
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Anna Korhonen, David Traum, Lluís Màrquez
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1685–1695
URL:
https://aclanthology.org/P19-1165
DOI:
10.18653/v1/P19-1165
Cite (ACL):
Ignacio Iacobacci and Roberto Navigli. 2019. LSTMEmbed: Learning Word and Sense Representations from a Large Semantically Annotated Corpus with Long Short-Term Memories. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1685–1695, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
LSTMEmbed: Learning Word and Sense Representations from a Large Semantically Annotated Corpus with Long Short-Term Memories (Iacobacci & Navigli, ACL 2019)
PDF:
https://aclanthology.org/P19-1165.pdf
Video:
https://vimeo.com/384489801