Emile Cornamusaz
2026
Armenian AutoEpiDoc: Automated Extraction and Encoding of Armenian Inscriptions into EpiDoc TEI/XML
Hamest Tamrazyan | Emile Cornamusaz | Emanuela Boros
Proceedings of the 10th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature 2026
Hamest Tamrazyan | Emile Cornamusaz | Emanuela Boros
Proceedings of the 10th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature 2026
Armenian epigraphy is extensively documented in printed scholarly corpora, yet lacks machine-readable editions that support interoperability or computational analysis. In this paper, we present Armenian AutoEpiDoc, a system that automatically converts expert-verified Armenian inscription records into EpiDoc-compliant TEI/XML files. Operating on curated and domain-validated data, AutoEpiDoc maps Armenian-specific metadata to EpiDoc structures through rule-based templates and schema-aware validation. The workflow significantly reduces manual encoding effort and provides a scalable path toward producing digital editions and integrating Armenian inscriptions into international epigraphic infrastructures.