Geração de consultas SPARQL a partir de linguagem natural

Heber Gustavo Xavier de Castro, Clever Ricardo Guareis de Farias


Abstract
The Semantic Web aims to make web data understandable not only to humans but also to machines, enabling more efficient data integration, sharing, and reuse. Linked Open Data (LOD) initiatives have supported this vision by promoting the publication of semantically annotated and interconnected data. However, querying LOD repositories typically requires knowledge of SPARQL, a complex query language that limits access for non-expert users. Although several approaches have been proposed to automatically generate SPARQL queries from natural-language questions, most are designed for English and are tightly coupled to specific domains, which hinders reuse. This article presents a generic, domain-independent approach for generating SPARQL queries from questions written in Portuguese. The proposed method uses reference questions, parameterized query templates, and a synonym dictionary enriched by lexical resources and similarity metrics. The implementation is supported by the Natural2SPARQL tool, and the approach is validated through a case study in the financial domain using real data from the Brazilian stock exchange (B3). The results indicate that the method enables flexible and semantically accurate SPARQL query generation from natural-language input. Unlike learning-based approaches, our method avoids retraining and achieves up to 93.3% end-to-end success in controlled settings, demonstrating robustness and low adaptation cost.
Anthology ID:
2026.propor-1.67
Volume:
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
Month:
April
Year:
2026
Address:
Salvador, Brazil
Editors:
Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
Venue:
PROPOR
Publisher:
Association for Computational Linguistics
Pages:
676–686
URL:
https://aclanthology.org/2026.propor-1.67/
Cite (ACL):
Heber Gustavo Xavier de Castro and Clever Ricardo Guareis de Farias. 2026. Geração de consultas SPARQL a partir de linguagem natural. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1, pages 676–686, Salvador, Brazil. Association for Computational Linguistics.
Cite (Informal):
Geração de consultas SPARQL a partir de linguagem natural (Castro & Farias, PROPOR 2026)
PDF:
https://aclanthology.org/2026.propor-1.67.pdf