Geographical Evaluation of Word Embeddings

Michal Konkol, Tomáš Brychcín, Michal Nykl, Tomáš Hercig


Abstract
Word embeddings are commonly compared either with human-annotated word similarities or through improvements in natural language processing tasks. We propose a novel principle that compares the information from word embeddings with reality. We implement this principle by comparing the information in the word embeddings with the geographical positions of cities. Our evaluation linearly transforms the semantic space to optimally fit the real positions of cities and measures the deviation between the position given by the word embeddings and the real position. We evaluate a set of well-known word embeddings with state-of-the-art results. We also introduce a visualization that helps with error analysis.
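The evaluation idea in the abstract (fit a linear transformation from the embedding space to city positions, then measure the deviation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of ordinary least squares with a bias term, the plain Euclidean deviation in coordinate space, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np

def fit_linear_map(embeddings, coords):
    """Least-squares fit of coords ~= [embeddings, 1] @ W (assumed objective)."""
    # Append a constant column so the map can include an offset.
    X = np.hstack([embeddings, np.ones((embeddings.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X, coords, rcond=None)
    return W

def predict(embeddings, W):
    """Project embeddings into the 2-D coordinate space with the fitted map."""
    X = np.hstack([embeddings, np.ones((embeddings.shape[0], 1))])
    return X @ W

def mean_deviation(pred, true):
    """Mean Euclidean distance between predicted and true positions.

    Assumption: a plain Euclidean distance in coordinate space; a real
    evaluation on the globe would likely need a geographic distance.
    """
    return float(np.mean(np.linalg.norm(pred - true, axis=1)))

# Synthetic stand-in data (hypothetical): 50 "cities" with 10-dimensional
# vectors whose coordinates are an exact affine function of the vectors.
rng = np.random.default_rng(0)
n_cities, dim = 50, 10
embeddings = rng.normal(size=(n_cities, dim))
true_map = rng.normal(size=(dim, 2))
coords = embeddings @ true_map + np.array([10.0, 50.0])

W = fit_linear_map(embeddings, coords)
error = mean_deviation(predict(embeddings, W), coords)
print(f"mean deviation on synthetic data: {error:.2e}")
```

Because the synthetic coordinates are constructed as an exact affine function of the vectors, the fitted map recovers them almost perfectly. With real embeddings the remaining deviation would be the quantity of interest, and it would normally be measured on cities held out from the fit.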
Anthology ID:
I17-1023
Volume:
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
November
Year:
2017
Address:
Taipei, Taiwan
Editors:
Greg Kondrak, Taro Watanabe
Venue:
IJCNLP
Publisher:
Asian Federation of Natural Language Processing
Pages:
224–232
URL:
https://aclanthology.org/I17-1023
Cite (ACL):
Michal Konkol, Tomáš Brychcín, Michal Nykl, and Tomáš Hercig. 2017. Geographical Evaluation of Word Embeddings. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 224–232, Taipei, Taiwan. Asian Federation of Natural Language Processing.
Cite (Informal):
Geographical Evaluation of Word Embeddings (Konkol et al., IJCNLP 2017)
PDF:
https://aclanthology.org/I17-1023.pdf