Kalvin Hartwig


2023

The Ojibwe language has several dialects that vary to some degree in both spoken and written form. We present a method that uses support vector machines to distinguish two of these dialects, Eastern and Southwestern Ojibwe, from a very small text corpus. Classification accuracy at the sentence level is 90% under five-fold cross-validation, and 72% when the sentence-trained model is applied to a data set of individual words. Our code and the word-level data set are released openly on GitHub at [link to be inserted for final version, working demonstration notebook uploaded with paper].
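
The pipeline summarized above (train a support vector machine on sentences, evaluate with five-fold cross-validation, then apply the sentence-trained model to single words) can be illustrated with a minimal sketch. It is not the released notebook. The character n-gram features, the Pegasos-style training loop, the hyperparameters, and all function names are assumptions made for illustration. The example strings are invented placeholders built from two disjoint alphabets; they are not Ojibwe text.

```python
import random
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Count character n-grams inside each whitespace-separated word (assumed feature set)."""
    feats = Counter()
    for word in text.lower().split():
        for n in range(n_min, n_max + 1):
            for i in range(len(word) - n + 1):
                feats[word[i:i + n]] += 1
    return feats


def score(w, x):
    """Linear decision value w . x for a sparse feature dictionary x."""
    return sum(w.get(k, 0.0) * v for k, v in x.items())


def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Linear SVM (no bias term) fitted by Pegasos-style subgradient descent on the hinge loss."""
    rng = random.Random(seed)
    w, t = {}, 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)                      # decaying step size
            violated = y[i] * score(w, X[i]) < 1.0     # margin check before the update
            for k in w:
                w[k] *= 1.0 - 1.0 / t                  # shrinkage from the L2 penalty
            if violated:
                for k, v in X[i].items():
                    w[k] = w.get(k, 0.0) + eta * y[i] * v
    return w


def predict(w, text):
    """Return +1 or -1 for a sentence or a single word."""
    return 1 if score(w, char_ngrams(text)) >= 0 else -1


def cross_validate(texts, labels, k=5):
    """Mean accuracy over k folds; item i is assigned to fold i % k."""
    accuracies = []
    for fold in range(k):
        train = [i for i in range(len(texts)) if i % k != fold]
        test = [i for i in range(len(texts)) if i % k == fold]
        w = train_linear_svm([char_ngrams(texts[i]) for i in train],
                             [labels[i] for i in train])
        hits = sum(predict(w, texts[i]) == labels[i] for i in test)
        accuracies.append(hits / len(test))
    return sum(accuracies) / k


# Invented placeholder "sentences" (NOT Ojibwe): +1 and -1 stand in for the two dialects.
texts = ["zaza zoz", "zoza zaz", "zaz ozo", "azo zaza", "zozo aza",
         "kiki kek", "keki kik", "kik eke", "ike kiki", "keke iki"]
labels = [1] * 5 + [-1] * 5

# Sentence-level evaluation with five-fold cross-validation.
cv_accuracy = cross_validate(texts, labels, k=5)

# Apply a model trained on all sentences to individual words.
sentence_model = train_linear_svm([char_ngrams(t) for t in texts], labels)
words = ["zaz", "ozo", "kek", "iki"]
word_predictions = [predict(sentence_model, word) for word in words]
```

Because the placeholder classes share no characters, this toy problem is trivially separable and says nothing about the accuracies reported above. Real dialect text shares most of its n-grams across classes, which is where the regularization strength and the n-gram range would need tuning.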