Learning Composition Models for Phrase Embeddings

Mo Yu, Mark Dredze


Abstract
Lexical embeddings can serve as useful representations for words for a variety of NLP tasks, but learning embeddings for phrases can be challenging. While separate embeddings are learned for each word, this is infeasible for every phrase. We construct phrase embeddings by learning how to compose word embeddings using features that capture phrase structure and context. We propose efficient unsupervised and task-specific learning objectives that scale our model to large datasets. We demonstrate improvements on both language modeling and several phrase semantic similarity tasks with various phrase lengths. We make the implementation of our model and the datasets available for general use.
Anthology ID:
Q15-1017
Volume:
Transactions of the Association for Computational Linguistics, Volume 3
Year:
2015
Address:
Cambridge, MA
Editors:
Michael Collins, Lillian Lee
Venue:
TACL
Publisher:
MIT Press
Pages:
227–242
URL:
https://aclanthology.org/Q15-1017
DOI:
10.1162/tacl_a_00135
Cite (ACL):
Mo Yu and Mark Dredze. 2015. Learning Composition Models for Phrase Embeddings. Transactions of the Association for Computational Linguistics, 3:227–242.
Cite (Informal):
Learning Composition Models for Phrase Embeddings (Yu & Dredze, TACL 2015)
PDF:
https://aclanthology.org/Q15-1017.pdf
Code:
Gorov/FCT_PhraseSim_TACL