Exploring Similarity Measures and Intertextuality in Vedic Sanskrit Literature

So Miyagawa, Yuki Kyogoku, Yuzuki Tsukagoshi, Kyoko Amano


Abstract
This paper examines semantic similarity and intertextuality in selected texts from the Vedic Sanskrit corpus, specifically the Maitrāyaṇī Saṃhitā (MS) and Kāṭhaka-Saṃhitā (KS). Three computational methods are employed: Word2Vec for word embeddings, the stylo package for stylometric analysis, and TRACER for text reuse detection. By comparing various sections of the texts at different granularities, patterns of similarity and structural alignment are uncovered, providing insights into textual relationships and chronology. Word embeddings capture semantic similarities, while stylometric analysis reveals clusters and components that differentiate the texts. TRACER identifies parallel passages, indicating probable instances of text reuse. The computational analysis corroborates previous philological studies, suggesting a shared period of composition between MS.1.9 and MS.1.7. This research highlights the potential of computational methods in studying ancient Sanskrit literature, complementing traditional approaches. The agreement among the methods strengthens the validity of the findings, and the visualizations offer a nuanced understanding of textual connections. The study demonstrates that smaller chunk sizes are more effective for detecting intertextual parallels, showcasing the power of these techniques in unraveling the complexities of ancient texts.
Anthology ID:
2024.nlp4dh-1.12
Volume:
Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities
Month:
November
Year:
2024
Address:
Miami, USA
Editors:
Mika Hämäläinen, Emily Öhman, So Miyagawa, Khalid Alnajjar, Yuri Bizzoni
Venue:
NLP4DH
Publisher:
Association for Computational Linguistics
Pages:
123–131
URL:
https://aclanthology.org/2024.nlp4dh-1.12
Cite (ACL):
So Miyagawa, Yuki Kyogoku, Yuzuki Tsukagoshi, and Kyoko Amano. 2024. Exploring Similarity Measures and Intertextuality in Vedic Sanskrit Literature. In Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities, pages 123–131, Miami, USA. Association for Computational Linguistics.
Cite (Informal):
Exploring Similarity Measures and Intertextuality in Vedic Sanskrit Literature (Miyagawa et al., NLP4DH 2024)
PDF:
https://aclanthology.org/2024.nlp4dh-1.12.pdf