Improving Toponym Resolution with Better Candidate Generation, Transformer-based Reranking, and Two-Stage Resolution

Zeyu Zhang, Steven Bethard


Abstract
Geocoding is the task of converting location mentions in text into structured data that encodes the geospatial semantics. We propose a new architecture for geocoding, GeoNorm. GeoNorm first uses information retrieval techniques to generate a list of candidate entries from the geospatial ontology. Then it reranks the candidate entries using a transformer-based neural network that incorporates information from the ontology such as the entry’s population. This generate-and-rerank process is applied twice: first to resolve the less ambiguous countries, states, and counties, and second to resolve the remaining location mentions, using the identified countries, states, and counties as context. Our proposed toponym resolution framework achieves state-of-the-art performance on multiple datasets. Code and models are available at https://github.com/clulab/geonorm.
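The two-stage generate-and-rerank flow described in the abstract can be illustrated with a minimal sketch. This is not the GeoNorm implementation: the toy gazetteer, the `Entry` fields, the token-overlap candidate generator, and the hand-written scoring function (log population plus a bonus when an entry's parent region was already resolved) are all assumptions made for illustration, standing in for the paper's information retrieval component and transformer-based reranker. Population figures are rough and illustrative only.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical ontology entry; field names are illustrative."""
    entry_id: str
    name: str
    feature: str        # "country", "state", "county", or "city"
    population: int
    parent: str = ""    # name of the containing region, if any


# Toy gazetteer (assumption): a real system would query a full ontology.
GAZETTEER = [
    Entry("1", "France", "country", 67_000_000),
    Entry("2", "Texas", "state", 29_000_000, "United States"),
    Entry("3", "Paris", "city", 2_100_000, "France"),
    Entry("4", "Paris", "city", 25_000, "Texas"),
]

ADMIN_FEATURES = {"country", "state", "county"}


def generate(mention, entries):
    """Candidate generation: keep entries sharing a token with the mention."""
    tokens = set(mention.lower().split())
    return [e for e in entries if tokens & set(e.name.lower().split())]


def score(entry, context):
    """Stand-in for a learned reranker: population prior plus context bonus."""
    value = math.log10(entry.population + 1)
    if entry.parent in context:
        value += 10.0
    return value


def rerank(candidates, context):
    """Order candidates from most to least plausible under `score`."""
    return sorted(candidates, key=lambda e: score(e, context), reverse=True)


def resolve(mentions):
    """Two-stage resolution: administrative regions first, then the rest."""
    resolved = {}
    # Stage 1: resolve mentions that match a country, state, or county.
    for mention in mentions:
        candidates = [e for e in generate(mention, GAZETTEER)
                      if e.feature in ADMIN_FEATURES]
        if candidates:
            resolved[mention] = rerank(candidates, set())[0]
    # Names of stage-1 regions become context for the remaining mentions.
    context = {e.name for e in resolved.values()}
    # Stage 2: resolve everything else using that context.
    for mention in mentions:
        if mention not in resolved:
            candidates = generate(mention, GAZETTEER)
            if candidates:
                resolved[mention] = rerank(candidates, context)[0]
    return resolved
```

In this sketch, `resolve(["Paris", "Texas"])` first fixes "Texas" as a state, and that context then steers "Paris" toward the entry whose parent is Texas; without any context, the population prior favors the larger Paris.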
Anthology ID:
2023.starsem-1.6
Volume:
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Alexis Palmer, Jose Camacho-Collados
Venue:
*SEM
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
48–60
URL:
https://aclanthology.org/2023.starsem-1.6
DOI:
10.18653/v1/2023.starsem-1.6
Cite (ACL):
Zeyu Zhang and Steven Bethard. 2023. Improving Toponym Resolution with Better Candidate Generation, Transformer-based Reranking, and Two-Stage Resolution. In Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023), pages 48–60, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Improving Toponym Resolution with Better Candidate Generation, Transformer-based Reranking, and Two-Stage Resolution (Zhang & Bethard, *SEM 2023)
PDF:
https://aclanthology.org/2023.starsem-1.6.pdf