%0 Conference Proceedings
%T Extrapolation in NLP
%A Mitchell, Jeff
%A Stenetorp, Pontus
%A Minervini, Pasquale
%A Riedel, Sebastian
%Y Bisk, Yonatan
%Y Levy, Omer
%Y Yatskar, Mark
%S Proceedings of the Workshop on Generalization in the Age of Deep Learning
%D 2018
%8 June
%I Association for Computational Linguistics
%C New Orleans, Louisiana
%F mitchell-etal-2018-extrapolation
%X We argue that extrapolation to unseen data will often be easier for models that capture global structures, rather than just maximise their local fit to the training data. We show that this is true for two popular models: the Decomposable Attention Model and word2vec.
%R 10.18653/v1/W18-1005
%U https://aclanthology.org/W18-1005
%U https://doi.org/10.18653/v1/W18-1005
%P 28-33