Sanskrit Voyager: Unified Web Platform for Interactive Reading and Linguistic Analysis of Sanskrit Texts

Giacomo De Luca, Danilo Croce, Roberto Basili


Abstract
Sanskrit Voyager is a web application for searching, reading, and analyzing the texts in the Sanskrit literary corpus. Unlike previous tools that require expert linguistic knowledge or manual normalization, Sanskrit Voyager enables users to search for words and phrases as they actually appear in texts, handling inflection, sandhi, and compound forms automatically while supporting any transliteration. The system integrates four core functionalities: (1) multi-dictionary lookup with morphological analysis and inflection tables; (2) real-time text parsing and annotation; (3) an interactive reader for over 900 digitalized texts; and (4) advanced corpus search with fuzzy matching and filtering. Evaluation shows over 92% parsing accuracy on complex compounds and substantially higher search recall than BuddhaNexus on challenging queries. Source code is publicly available under CC-BY-NC license, resource-efficient, and designed for both learners and researchers, offering the first fully integrated, user-friendly platform for computational Sanskrit studies.
Anthology ID:
2025.emnlp-demos.18
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Ivan Habernal, Peter Schulam, Jörg Tiedemann
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
244–253
Language:
URL:
https://aclanthology.org/2025.emnlp-demos.18/
DOI:
Bibkey:
Cite (ACL):
Giacomo De Luca, Danilo Croce, and Roberto Basili. 2025. Sanskrit Voyager: Unified Web Platform for Interactive Reading and Linguistic Analysis of Sanskrit Texts. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 244–253, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Sanskrit Voyager: Unified Web Platform for Interactive Reading and Linguistic Analysis of Sanskrit Texts (De Luca et al., EMNLP 2025)
Copy Citation:
PDF:
https://aclanthology.org/2025.emnlp-demos.18.pdf