AutoExtend: Combining Word Embeddings with Semantic Resources

Sascha Rothe, Hinrich Schütze


Abstract
We present AutoExtend, a system that combines word embeddings with semantic resources in two ways: it learns embeddings for non-word objects such as synsets and entities, and it learns word embeddings that incorporate the semantic information from the resource. The method is based on encoding and decoding the word embeddings. It is flexible in that it can take any word embeddings as input and does not need an additional training corpus. The obtained embeddings live in the same vector space as the input word embeddings. A sparse tensor formalization guarantees efficiency and parallelizability. We use WordNet, GermaNet, and Freebase as semantic resources. AutoExtend achieves state-of-the-art performance on Word-in-Context Similarity and Word Sense Disambiguation tasks.
Anthology ID:
J17-3004
Volume:
Computational Linguistics, Volume 43, Issue 3 - September 2017
Month:
September
Year:
2017
Address:
Cambridge, MA
Venue:
CL
Publisher:
MIT Press
Pages:
593–617
URL:
https://aclanthology.org/J17-3004
DOI:
10.1162/COLI_a_00294
Cite (ACL):
Sascha Rothe and Hinrich Schütze. 2017. AutoExtend: Combining Word Embeddings with Semantic Resources. Computational Linguistics, 43(3):593–617.
Cite (Informal):
AutoExtend: Combining Word Embeddings with Semantic Resources (Rothe & Schütze, CL 2017)
PDF:
https://aclanthology.org/J17-3004.pdf