Learning Word Meta-Embeddings by Autoencoding

Danushka Bollegala, Cong Bao


Abstract
Distributed word embeddings have shown superior performance in numerous Natural Language Processing (NLP) tasks. However, their performance varies significantly across tasks, implying that the word embeddings learnt by different methods capture complementary aspects of lexical semantics. We therefore believe it is important to combine existing word embeddings to produce more accurate and complete meta-embeddings of words. We model meta-embedding learning as an autoencoding problem, in which we learn a meta-embedding space that can accurately reconstruct all source embeddings simultaneously. The meta-embedding space is thereby forced to capture complementary information from the different source embeddings in a coherent common embedding space. We propose three flavours of autoencoded meta-embeddings, motivated by different requirements that a meta-embedding must satisfy. Our experimental results on a series of benchmark evaluations show that the proposed autoencoded meta-embeddings outperform the existing state-of-the-art meta-embeddings in multiple tasks.
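The core idea in the abstract, a shared space from which every source embedding can be reconstructed, can be illustrated with a minimal sketch. It assumes two source embeddings, linear encoders and decoders, concatenation as the combination step, and plain gradient descent on a squared reconstruction loss. These choices, the synthetic toy data, and all names (train_meta_embedding, E1, E2, D1, D2) are illustrative assumptions; they are not taken from the paper or its released code and need not match any of the three proposed variants.

```python
import numpy as np


def train_meta_embedding(S1, S2, k=4, lr=0.01, epochs=300, seed=0):
    """Learn a meta-embedding that reconstructs both sources (toy linear sketch).

    S1: (n, d1) and S2: (n, d2) hold source embeddings for the same n words.
    Returns the (n, 2k) meta-embedding matrix and the per-epoch loss values.
    """
    rng = np.random.default_rng(seed)
    n, d1 = S1.shape
    d2 = S2.shape[1]
    # Hypothetical linear encoders: each source is mapped to k dimensions.
    E1 = rng.normal(scale=0.1, size=(d1, k))
    E2 = rng.normal(scale=0.1, size=(d2, k))
    # Hypothetical linear decoders: the full meta-embedding reconstructs each source.
    D1 = rng.normal(scale=0.1, size=(2 * k, d1))
    D2 = rng.normal(scale=0.1, size=(2 * k, d2))

    losses = []
    for _ in range(epochs):
        M = np.hstack([S1 @ E1, S2 @ E2])  # meta-embedding, shape (n, 2k)
        R1 = M @ D1 - S1                   # reconstruction error for source 1
        R2 = M @ D2 - S2                   # reconstruction error for source 2
        losses.append((np.sum(R1 ** 2) + np.sum(R2 ** 2)) / n)

        # Gradients of the mean squared reconstruction loss.
        gD1 = 2.0 * M.T @ R1 / n
        gD2 = 2.0 * M.T @ R2 / n
        gM = 2.0 * (R1 @ D1.T + R2 @ D2.T) / n
        gE1 = S1.T @ gM[:, :k]
        gE2 = S2.T @ gM[:, k:]

        E1 -= lr * gE1
        E2 -= lr * gE2
        D1 -= lr * gD1
        D2 -= lr * gD2

    # Encode once more with the final weights.
    return np.hstack([S1 @ E1, S2 @ E2]), losses


# Toy data (an assumption): two "sources" that are noisy linear views of a shared latent space.
rng = np.random.default_rng(1)
Z = rng.normal(size=(50, 3))
S1 = Z @ rng.normal(size=(3, 5)) + 0.1 * rng.normal(size=(50, 5))
S2 = Z @ rng.normal(size=(3, 8)) + 0.1 * rng.normal(size=(50, 8))

meta, losses = train_meta_embedding(S1, S2)
print(meta.shape, losses[0], losses[-1])
```

Because both decoders read the full concatenated vector, each half of the meta-embedding is trained to help reconstruct both sources, which is one simple way to encourage the complementary information to be shared.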
Anthology ID:
C18-1140
Volume:
Proceedings of the 27th International Conference on Computational Linguistics
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Emily M. Bender, Leon Derczynski, Pierre Isabelle
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
1650–1661
URL:
https://aclanthology.org/C18-1140/
Cite (ACL):
Danushka Bollegala and Cong Bao. 2018. Learning Word Meta-Embeddings by Autoencoding. In Proceedings of the 27th International Conference on Computational Linguistics, pages 1650–1661, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Learning Word Meta-Embeddings by Autoencoding (Bollegala & Bao, COLING 2018)
PDF:
https://aclanthology.org/C18-1140.pdf
Code:
CongBao/AutoencodedMetaEmbedding