A Lightweight String Based Method of Encoding Etymologies in Linked Data Lexical Resources

Anas Fahad Khan, Maxim Ionov, Paola Marongiu, Ana Salgado


Abstract
In this submission, we propose an approach to encoding etymological information as strings (“etymology strings”). We begin by discussing the advantages of this approach over one in which etymologies and etymons are explicitly represented as RDF individuals. Next, we give a formal description of the regular language underlying our approach as an Extended Backus-Naur Form (EBNF) grammar. We use the Chamuça Hindi lexicon as a test case and show some of the kinds of SPARQL queries that can be made using etymology strings.
Anthology ID:
2025.ontolex-1.4
Volume:
Proceedings of the 5th Conference on Language, Data and Knowledge: The 5th OntoLex Workshop
Month:
September
Year:
2025
Address:
Naples, Italy
Editors:
Katerina Gkirtzou, Slavko Žitnik, Jorge Gracia, Dagmar Gromann, Maria Pia di Buono, Johanna Monti, Maxim Ionov
Venues:
ontolex | WS
Publisher:
Unior Press
Pages:
30–34
URL:
https://aclanthology.org/2025.ontolex-1.4/
Cite (ACL):
Anas Fahad Khan, Maxim Ionov, Paola Marongiu, and Ana Salgado. 2025. A Lightweight String Based Method of Encoding Etymologies in Linked Data Lexical Resources. In Proceedings of the 5th Conference on Language, Data and Knowledge: The 5th OntoLex Workshop, pages 30–34, Naples, Italy. Unior Press.
Cite (Informal):
A Lightweight String Based Method of Encoding Etymologies in Linked Data Lexical Resources (Khan et al., ontolex 2025)
PDF:
https://aclanthology.org/2025.ontolex-1.4.pdf