DiaHClust: an Iterative Hierarchical Clustering Approach for Identifying Stages in Language Change

Christin Schätzle, Hannah Booth


Abstract
Language change is often assessed against a set of pre-determined time periods in order to trace its diachronic trajectory. This is problematic, since a pre-determined periodization might obscure significant developments and lead to false assumptions about the data. Moreover, these time periods can be based on factors which are either arbitrary or non-linguistic, e.g., dividing the corpus data into equidistant stages or taking into account language-external events. To address this problem, we present a data-driven approach to periodization: ‘DiaHClust’. DiaHClust is based on iterative hierarchical clustering and offers a multi-layered perspective on change, from the level of individual texts to broader time periods. We demonstrate the usefulness of DiaHClust via a case study investigating syntactic change in Icelandic, modelling the syntactic system of the language in terms of vectors of syntactic change.
Anthology ID:
W19-4716
Volume:
Proceedings of the 1st International Workshop on Computational Approaches to Historical Language Change
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Nina Tahmasebi, Lars Borin, Adam Jatowt, Yang Xu
Venue:
LChange
Publisher:
Association for Computational Linguistics
Pages:
126–135
URL:
https://aclanthology.org/W19-4716
DOI:
10.18653/v1/W19-4716
Cite (ACL):
Christin Schätzle and Hannah Booth. 2019. DiaHClust: an Iterative Hierarchical Clustering Approach for Identifying Stages in Language Change. In Proceedings of the 1st International Workshop on Computational Approaches to Historical Language Change, pages 126–135, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
DiaHClust: an Iterative Hierarchical Clustering Approach for Identifying Stages in Language Change (Schätzle & Booth, LChange 2019)
PDF:
https://aclanthology.org/W19-4716.pdf