Heber Gustavo Xavier de Castro

2026

Geração de consultas SPARQL a partir de linguagem natural (Generating SPARQL queries from natural language)
Heber Gustavo Xavier de Castro | Clever Ricardo Guareis de Farias
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1

The Semantic Web aims to make web data understandable not only to humans but also to machines, enabling more efficient data integration, sharing, and reuse. Linked Open Data (LOD) initiatives have supported this vision by promoting the publication of semantically annotated and interconnected data. However, querying LOD repositories typically requires knowledge of SPARQL, a complex query language that limits access for non-expert users. Although several approaches have been proposed to automatically generate SPARQL queries from natural-language questions, most are designed for English and are tightly coupled to specific domains, which hinders reuse. This article presents a generic, domain-independent approach for generating SPARQL queries from questions written in Portuguese. The proposed method uses reference questions, parameterized query templates, and a synonym dictionary enriched by lexical resources and similarity metrics. The implementation is supported by the Natural2SPARQL tool, and the approach is validated through a case study in the financial domain using real data from the Brazilian stock exchange (B3). The results indicate that the method enables flexible and semantically accurate SPARQL query generation from natural-language input. Unlike learning-based approaches, our method avoids retraining and achieves up to 93.3% end-to-end success in controlled settings, demonstrating robustness and low adaptation cost.
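The combination of reference questions, parameterized templates, and a synonym dictionary described in the abstract can be illustrated with a minimal sketch. This is not the Natural2SPARQL implementation: the `ex:` vocabulary, the synonym entries, the two reference questions, the ticker pattern, and the function name `generate_sparql` are all hypothetical, and Python's `difflib.SequenceMatcher` stands in for whatever similarity metrics the authors actually use.

```python
import re
from difflib import SequenceMatcher

# Hypothetical namespace; the real ontology used by the authors is not given here.
PREFIX = "PREFIX ex: <http://example.org/b3#>\n"

# Hypothetical synonym dictionary: Portuguese surface term -> ontology property.
SYNONYMS = {
    "preço de fechamento": "precoFechamento",
    "fechamento": "precoFechamento",
    "preço de abertura": "precoAbertura",
    "abertura": "precoAbertura",
}

# Hypothetical reference questions, each paired with a parameterized template.
# Doubled braces are literal braces after str.format().
TEMPLATES = [
    (
        "qual foi o <metric> da ação <ticker>",
        'SELECT ?value WHERE {{ ?q ex:ticker "{ticker}" ; ex:{metric} ?value . }}',
    ),
    (
        "em que data a ação <ticker> teve o maior <metric>",
        'SELECT ?date WHERE {{ ?q ex:ticker "{ticker}" ; ex:date ?date ; '
        "ex:{metric} ?value . }} ORDER BY DESC(?value) LIMIT 1",
    ),
]


def generate_sparql(question):
    """Return a SPARQL query string for the question, or None if no slot is found."""
    # Assumption: tickers look like four capital letters plus one or two digits.
    ticker_match = re.search(r"\b[A-Z]{4}\d{1,2}\b", question)
    if ticker_match is None:
        return None
    ticker = ticker_match.group(0)
    normalized = question.replace(ticker, "<ticker>").lower().rstrip("?").strip()

    # Try longer synonyms first so multi-word terms win over their substrings.
    term = next(
        (t for t in sorted(SYNONYMS, key=len, reverse=True) if t in normalized),
        None,
    )
    if term is None:
        return None
    normalized = normalized.replace(term, "<metric>", 1)

    # Pick the reference question most similar to the normalized input.
    _, template = max(
        TEMPLATES,
        key=lambda pair: SequenceMatcher(None, normalized, pair[0]).ratio(),
    )
    return PREFIX + template.format(ticker=ticker, metric=SYNONYMS[term])


if __name__ == "__main__":
    print(generate_sparql("Qual foi o preço de fechamento da ação PETR4?"))
```

Because domain knowledge lives only in the dictionary and the template list, adapting this sketch to another domain would mean editing data rather than retraining a model, which mirrors the low adaptation cost the abstract claims for the template-based approach.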

