Where on Earth Do Users Say They Are?: Geo-Entity Linking for Noisy Multilingual User Input

Tessa Masis, Brendan O’Connor


Abstract
Geo-entity linking is the task of linking a location mention to the real-world geographic location. In this work, we explore the challenging task of geo-entity linking for noisy, multilingual social media data. There are few open-source multilingual geo-entity linking tools available, and existing ones are often either rule-based, which break easily in social media settings, or LLM-based, which are too expensive for large-scale datasets. We present a method which represents real-world locations as averaged embeddings from labeled user-input location names and allows for selective prediction via an interpretable confidence score. We show that our approach improves geo-entity linking on a global and multilingual social media dataset, and discuss progress and problems with evaluating at different geographic granularities.
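As a rough illustration of the idea in the abstract (locations represented as averaged embeddings of labeled user-input names, with selective prediction), here is a minimal Python sketch. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the hashed character-trigram `embed` function stands in for whatever text encoder the paper uses, cosine similarity to the nearest averaged embedding stands in for the paper's confidence score, and the names `build_location_embeddings`, `link`, `LABELED`, and the `0.5` threshold are hypothetical.

```python
import math
import zlib
from collections import defaultdict

DIM = 256  # toy embedding size (assumption, not from the paper)


def normalize(vec):
    """Scale a vector to unit length; leave an all-zero vector unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def embed(text):
    """Toy stand-in for a learned encoder: hashed character trigrams."""
    vec = [0.0] * DIM
    padded = f"  {text.lower().strip()}  "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        vec[zlib.crc32(trigram.encode("utf-8")) % DIM] += 1.0
    return normalize(vec)


def build_location_embeddings(labeled_names):
    """Average the embeddings of all user-input names labeled with each location."""
    groups = defaultdict(list)
    for name, location_id in labeled_names:
        groups[location_id].append(embed(name))
    return {
        location_id: normalize([sum(col) / len(vecs) for col in zip(*vecs)])
        for location_id, vecs in groups.items()
    }


def link(text, location_embeddings, threshold=0.5):
    """Return (location_id, score), or (None, score) when abstaining.

    The score is the cosine similarity to the closest averaged embedding
    (an assumed confidence measure); below the threshold no link is made.
    """
    query = embed(text)
    best_location, best_score = None, -1.0
    for location_id, centroid in location_embeddings.items():
        score = sum(q * c for q, c in zip(query, centroid))
        if score > best_score:
            best_location, best_score = location_id, score
    if best_score < threshold:
        return None, best_score
    return best_location, best_score


# Hypothetical labeled user-input location strings.
LABELED = [
    ("New York City", "NYC"),
    ("nyc", "NYC"),
    ("new york, ny", "NYC"),
    ("Mexico City", "CDMX"),
    ("Ciudad de México", "CDMX"),
    ("CDMX", "CDMX"),
]

location_embeddings = build_location_embeddings(LABELED)
print(link("New York City", location_embeddings))
print(link("somewhere over the rainbow", location_embeddings))
```

Raising the threshold trades coverage for precision: more inputs are left unlinked, but the links that remain are backed by a higher similarity score.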
Anthology ID:
2024.nlpcss-1.7
Volume:
Proceedings of the Sixth Workshop on Natural Language Processing and Computational Social Science (NLP+CSS 2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Dallas Card, Anjalie Field, Dirk Hovy, Katherine Keith
Venues:
NLP+CSS | WS
Publisher:
Association for Computational Linguistics
Pages:
86–98
URL:
https://aclanthology.org/2024.nlpcss-1.7
DOI:
10.18653/v1/2024.nlpcss-1.7
Cite (ACL):
Tessa Masis and Brendan O’Connor. 2024. Where on Earth Do Users Say They Are?: Geo-Entity Linking for Noisy Multilingual User Input. In Proceedings of the Sixth Workshop on Natural Language Processing and Computational Social Science (NLP+CSS 2024), pages 86–98, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Where on Earth Do Users Say They Are?: Geo-Entity Linking for Noisy Multilingual User Input (Masis & O’Connor, NLP+CSS-WS 2024)
PDF:
https://aclanthology.org/2024.nlpcss-1.7.pdf