High-risk learning: acquiring new word vectors from tiny data

Aurélie Herbelot, Marco Baroni


Abstract
Distributional semantics models are known to struggle with small data. It is generally accepted that in order to learn ‘a good vector’ for a word, a model must have sufficient examples of its usage. This contradicts the fact that humans can guess the meaning of a word from a few occurrences only. In this paper, we show that a neural language model such as Word2Vec only necessitates minor modifications to its standard architecture to learn new terms from tiny data, using background knowledge from a previously learnt semantic space. We test our model on word definitions and on a nonce task involving 2-6 sentences’ worth of context, showing a large increase in performance over state-of-the-art models on the definitional task.
Anthology ID:
D17-1030
Volume:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Martha Palmer, Rebecca Hwa, Sebastian Riedel
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
304–309
URL:
https://aclanthology.org/D17-1030
DOI:
10.18653/v1/D17-1030
Cite (ACL):
Aurélie Herbelot and Marco Baroni. 2017. High-risk learning: acquiring new word vectors from tiny data. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 304–309, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
High-risk learning: acquiring new word vectors from tiny data (Herbelot & Baroni, EMNLP 2017)
PDF:
https://aclanthology.org/D17-1030.pdf
Attachment:
 D17-1030.Attachment.zip
Code:
 minimalparts/nonce2vec