Learning Efficient Task-Specific Meta-Embeddings with Word Prisms

Jingyi He, KC Tsiolis, Kian Kenyon-Dean, Jackie Chi Kit Cheung


Abstract
Word embeddings are trained to predict word cooccurrence statistics, which leads them to possess different lexical properties (syntactic, semantic, etc.) depending on the notion of context defined at training time. These properties manifest when querying the embedding space for the most similar vectors, and when used at the input layer of deep neural networks trained to solve downstream NLP problems. Meta-embeddings combine multiple sets of differently trained word embeddings, and have been shown to successfully improve intrinsic and extrinsic performance over equivalent models which use just one set of source embeddings. We introduce word prisms: a simple and efficient meta-embedding method that learns to combine source embeddings according to the task at hand. Word prisms learn orthogonal transformations to linearly combine the input source embeddings, which allows them to be very efficient at inference time. We evaluate word prisms in comparison to other meta-embedding methods on six extrinsic evaluations and observe that word prisms offer improvements in performance on all tasks.
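As a rough illustration of the idea in the abstract, the sketch below applies one orthogonal matrix to each source embedding and sums the results into a single meta-embedding. It is a toy under stated assumptions, not the authors' implementation: the 2-d vectors, the rotation matrices standing in for learned orthogonal transformations, the scalar per-source weights, and the helper names (`rotation`, `matvec`, `combine`) are all hypothetical, and every source is assumed to share the same dimensionality.

```python
import math

def rotation(theta):
    """Return a 2x2 rotation matrix, used here as a simple orthogonal transformation."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

def combine(sources, transforms, weights, word):
    """Weighted sum of orthogonally transformed source vectors for one word."""
    dim = len(transforms[0])
    out = [0.0] * dim
    for table, p, a in zip(sources, transforms, weights):
        projected = matvec(p, table[word])
        out = [o + a * x for o, x in zip(out, projected)]
    return out

# Toy source embeddings (hypothetical values, not real data).
source_a = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
source_b = {"cat": [0.5, 0.5], "dog": [-0.5, 0.5]}
sources = [source_a, source_b]

# Stand-ins for parameters that would be learned on the downstream task.
transforms = [rotation(0.0), rotation(math.pi / 2)]
weights = [0.7, 0.3]

# Because the combination is linear, it can be computed once per vocabulary
# item after training and stored as an ordinary lookup table.
meta = {w: combine(sources, transforms, weights, w) for w in source_a}
print(meta["cat"])
```

The last step reflects the efficiency claim: once the transformations are fixed, inference only needs a single table lookup per word, with no per-source computation.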
Anthology ID: 2020.coling-main.106
Volume: Proceedings of the 28th International Conference on Computational Linguistics
Month: December
Year: 2020
Address: Barcelona, Spain (Online)
Venue: COLING
Publisher: International Committee on Computational Linguistics
Pages: 1229–1241
URL: https://aclanthology.org/2020.coling-main.106
DOI: 10.18653/v1/2020.coling-main.106
PDF: https://aclanthology.org/2020.coling-main.106.pdf
Code: kylie-box/word_prisms
Data: ConceptNet, SNLI, SST