Santhawat Thanyawong


pdf bib
Written on Leaves or in Stones?: Computational Evidence for the Era of Authorship of Old Thai Prose
Attapol Rutherford | Santhawat Thanyawong
Proceedings of the 1st International Workshop on Computational Approaches to Historical Language Change

We aim to provide computational evidence for the era of authorship of two important old Thai texts: Traiphumikatha and Pumratchatham. The era of authorship of these two books is still an ongoing debate among Thai literature scholars. Analysis of old Thai texts present a challenge for standard natural language processing techniques, due to the lack of corpora necessary for building old Thai word and syllable segmentation. We propose an accurate and interpretable model to classify each segment as one of the three eras of authorship (Sukhothai, Ayuddhya, or Rattanakosin) without sophisticated linguistic preprocessing. Contrary to previous hypotheses, our model suggests that both books were written during the Sukhothai era. Moreover, the second half of the Pumratchtham is uncharacteristic of the Sukhothai era, which may have confounded literary scholars in the past. Further, our model reveals that the most indicative linguistic changes stem from unidirectional grammaticalized words and polyfunctional words, which show up as most dominant features in the model.