Sign Clustering and Topic Extraction in Proto-Elamite

Logan Born, Kate Kelley, Nishant Kambhatla, Carolyn Chen, Anoop Sarkar


Abstract
We describe a first attempt at using techniques from computational linguistics to analyze the undeciphered proto-Elamite script. Using hierarchical clustering, n-gram frequencies, and LDA topic models, we both replicate results obtained by manual decipherment and reveal previously unobserved relationships between signs. This demonstrates the utility of these techniques as an aid to manual decipherment.
Anthology ID:
W19-2516
Volume:
Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
Month:
June
Year:
2019
Address:
Minneapolis, USA
Editors:
Beatrice Alex, Stefania Degaetano-Ortlieb, Anna Kazantseva, Nils Reiter, Stan Szpakowicz
Venue:
LaTeCH
SIG:
SIGHUM
Publisher:
Association for Computational Linguistics
Pages:
122–132
URL:
https://aclanthology.org/W19-2516
DOI:
10.18653/v1/W19-2516
Cite (ACL):
Logan Born, Kate Kelley, Nishant Kambhatla, Carolyn Chen, and Anoop Sarkar. 2019. Sign Clustering and Topic Extraction in Proto-Elamite. In Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pages 122–132, Minneapolis, USA. Association for Computational Linguistics.
Cite (Informal):
Sign Clustering and Topic Extraction in Proto-Elamite (Born et al., LaTeCH 2019)
PDF:
https://aclanthology.org/W19-2516.pdf
Code:
sfu-natlang/pe-decipher-toolkit