Breno Dourado Sá


2022

Enhancing Geocoding of Adjectival Toponyms With Heuristics
Breno Dourado Sá | Ticiana Coelho da Silva | Jose Antonio Fernandes de Macedo
Proceedings of the LREC 2022 workshop on Natural Language Processing for Political Sciences

Unstructured text documents such as news articles and blogs often contain references to places. Those references, called toponyms, can be used in applications such as disaster warning and tourism planning. However, obtaining the correct coordinates for a toponym, a task called geocoding, is not easy, because many places share their name with other locations. The process becomes even more challenging when toponyms appear in adjectival form, since that form differs from the place's actual name. This paper addresses the geocoding task and aims to improve, through a heuristic approach, the process for adjectival toponyms. First, a baseline geocoder is defined by experimenting with a set of heuristics. The baseline is then enhanced by adding a normalization step, at the beginning of the geocoding process, that maps adjectival toponyms to their noun form. The results show improved performance for the enhanced geocoder compared to the baseline and other geocoders.
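
The normalization step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the lookup table, the toy gazetteer, its rough coordinates, and the function names `normalize_toponym` and `geocode` are all assumptions made for illustration. It only shows the general idea of mapping an adjectival toponym to its noun form before the coordinate lookup.

```python
# Hypothetical adjective -> noun lookup table (illustrative entries only,
# not taken from the paper).
ADJECTIVE_TO_NOUN = {
    "brazilian": "Brazil",
    "french": "France",
    "parisian": "Paris",
}

# Toy gazetteer with rough (latitude, longitude) values, assumed for the example.
TOY_GAZETTEER = {
    "Brazil": (-10.0, -55.0),
    "France": (46.0, 2.0),
    "Paris": (48.86, 2.35),
}


def normalize_toponym(toponym):
    """Map an adjectival toponym to its noun form; leave other names unchanged."""
    cleaned = toponym.strip()
    return ADJECTIVE_TO_NOUN.get(cleaned.lower(), cleaned)


def geocode(toponym, gazetteer=TOY_GAZETTEER):
    """Normalize first, then look up coordinates; return None if unknown."""
    return gazetteer.get(normalize_toponym(toponym))


if __name__ == "__main__":
    for name in ("Brazilian", "Parisian", "France", "Atlantis"):
        print(name, "->", geocode(name))
```

In this sketch, a name that is missing from the adjective table is passed to the gazetteer unchanged, so noun-form toponyms follow the same path they would without the normalization step.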