From Corpus to Concept Scheme: Developing a SKOS Vocabulary for Armenian Epigraphic Heritage

Hamest Tamrazyan, Kamal Nour, Emanuela Boros


Abstract
Armenian epigraphy, one of the world’s oldest and most diverse inscriptional traditions, remains largely absent from digital research infrastructures due to a lack of basic linguistic and conceptual resources. No machine-readable corpus, standardized terminology, or controlled vocabulary exists for describing Armenian inscription types, which prevents consistent indexing and interoperability. This paper addresses the gap by constructing the first dataset of Armenian inscription-type terminology and by developing a computational pipeline for analyzing it at scale. We digitize and preprocess a broad corpus of authoritative printed publications, curate a culturally grounded terminology list, and train transformer-based NER models to identify both attested inscription types and potential terminological variants in unseen texts. The resulting resources provide the first empirical foundation for modelling Armenian epigraphic concepts, which is needed to further develop a SKOS vocabulary aligned with, yet culturally distinct from, existing international epigraphic ontologies.
Anthology ID:
2026.latechclfl-1.1
Volume:
Proceedings of the 10th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature 2026
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Diego Alves, Yuri Bizzoni, Stefania Degaetano-Ortlieb, Anna Kazantseva, Janis Pagel, Stan Szpakowicz
Venues:
LaTeCH-CLfL | WS
SIG:
SIGHUM
Publisher:
Association for Computational Linguistics
Pages:
1–10
URL:
https://aclanthology.org/2026.latechclfl-1.1/
Cite (ACL):
Hamest Tamrazyan, Kamal Nour, and Emanuela Boros. 2026. From Corpus to Concept Scheme: Developing a SKOS Vocabulary for Armenian Epigraphic Heritage. In Proceedings of the 10th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature 2026, pages 1–10, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
From Corpus to Concept Scheme: Developing a SKOS Vocabulary for Armenian Epigraphic Heritage (Tamrazyan et al., LaTeCH-CLfL 2026)
PDF:
https://aclanthology.org/2026.latechclfl-1.1.pdf
Supplementary material:
2026.latechclfl-1.1.SupplementaryMaterial.zip
Supplementary material:
2026.latechclfl-1.1.SupplementaryMaterial.txt