Bad Form: Comparing Context-Based and Form-Based Few-Shot Learning in Distributional Semantic Models

Jeroen Van Hautte; Guy Emerson; Marek Rei

doi:10.18653/v1/D19-6104

Abstract

Word embeddings are an essential component in a wide range of natural language processing applications. However, distributional semantic models are known to struggle when only a small number of context sentences are available for a word. Several methods have been proposed to obtain higher-quality vectors for such words, leveraging this context information and, in hybrid approaches, the word forms themselves. We show that the current tasks do not suffice to evaluate models that use word-form information, as such models can easily leverage word forms in the training data that are related to word forms in the test data. We introduce 3 new tasks, allowing for a more balanced comparison between models. Furthermore, we show that hyperparameters that have largely been ignored in previous work can consistently improve the performance of both baseline and advanced models, achieving a new state of the art on 4 out of 6 tasks.

Anthology ID: D19-6104
Volume: Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)
Month: November
Year: 2019
Address: Hong Kong, China
Editors: Colin Cherry, Greg Durrett, George Foster, Reza Haffari, Shahram Khadivi, Nanyun Peng, Xiang Ren, Swabha Swayamdipta
Venue: WS
Publisher: Association for Computational Linguistics
Pages: 31–39
URL: https://aclanthology.org/D19-6104/
DOI: 10.18653/v1/D19-6104
Cite (ACL): Jeroen Van Hautte, Guy Emerson, and Marek Rei. 2019. Bad Form: Comparing Context-Based and Form-Based Few-Shot Learning in Distributional Semantic Models. In Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019), pages 31–39, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal): Bad Form: Comparing Context-Based and Form-Based Few-Shot Learning in Distributional Semantic Models (Van Hautte et al., 2019)
PDF: https://aclanthology.org/D19-6104.pdf
