Huntsville, hospitals, and hockey teams: Names can reveal your location

Bahar Salehi, Dirk Hovy, Eduard Hovy, Anders Søgaard


Abstract
Geolocation is the task of identifying a social media user’s primary location, and a growing body of work in natural language processing examines the extent to which automated analysis of social media posts can help. However, not all content features are equally revealing of a user’s location. In this paper, we evaluate nine named entity (NE) types. Using various metrics, we find that GEO-LOC, FACILITY and SPORT-TEAM are more informative for geolocation than other NE types. Using these types, we improve geolocation accuracy and reduce distance error over various well-known text-based methods.
Anthology ID:
W17-4415
Volume:
Proceedings of the 3rd Workshop on Noisy User-generated Text
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Leon Derczynski, Wei Xu, Alan Ritter, Tim Baldwin
Venue:
WNUT
Publisher:
Association for Computational Linguistics
Pages:
116–121
URL:
https://aclanthology.org/W17-4415/
DOI:
10.18653/v1/W17-4415
Cite (ACL):
Bahar Salehi, Dirk Hovy, Eduard Hovy, and Anders Søgaard. 2017. Huntsville, hospitals, and hockey teams: Names can reveal your location. In Proceedings of the 3rd Workshop on Noisy User-generated Text, pages 116–121, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Huntsville, hospitals, and hockey teams: Names can reveal your location (Salehi et al., WNUT 2017)
PDF:
https://aclanthology.org/W17-4415.pdf